PREFACE


This  book  is  not  a textbook  in  the  ordinary  sense  of  the  word 
but  rather  a reader  in  the  mathematical  sciences.  Using  simple 
examples  taken  from  physics  and  a variety  of  mathematical  problems, 
we  have  tried  to  introduce  the  reader  to  a broad  range  of  ideas  and 
methods  that  are  found  in  present-day  applications  of  mathematics  to 
physics,  engineering  and  other  fields.  Some  of  these  ideas  and  methods 
(such  as  the  use  of  the  delta  function,  the  principle  of  superposition, 
obtaining  asymptotic  expressions,  etc.)  have  not  been  sufficiently 
discussed in the ordinary run of mathematics textbooks for
non-mathematicians, and so this text can serve as a supplement to such textbooks.
Our  aim  has  been  to  elucidate  the  basic  ideas  of  mathematical  methods 
and the general laws of the phenomena at hand. Formal proofs,
exceptions and complicating factors have for the most part been dropped.
Instead  we  have  strived  in  certain  places  to  go  deeper  into  the  physical 
picture  of  the  processes. 

It  is  assumed  that  the  reader  has  a grasp  of  the  basic  essentials 
of  differential  and  integral  calculus  for  functions  of  one  variable, 
including  the  expansion  of  such  functions  in  power  series,  and  is  able 
to use such knowledge in the solution of physical problems. It is
sufficient (but not necessary!) to be acquainted with Zeldovich's Higher
Mathematics for Beginners (Mir Publishers, Moscow, 1973), which
we will refer to throughout this text as HM. What is more, the present
text may be regarded as a continuation of Higher Mathematics for
Beginners. In some places the material is closely related to Myskis'
Lectures in Higher Mathematics (Mir Publishers, Moscow, 1972).
Nevertheless,  this  text  is  completely  independent  and  requires  of  the 
reader  only  the  background  indicated  above.  Nothing  else. 

The  range  of  this  book  is  clear  from  the  table  of  contents.  The 
reader  need  not  study  it  chapter  by  chapter  but  may,  in  accordance 
with  his  particular  interests,  dip  into  any  chapter,  taking  care  to 
follow  up  any  references  to  other  sections  of  the  book.  For  the  sake  of 
convenience,  material  needed  in  other  parts  of  the  book  is  usually 
indicated at the beginning of the chapter. Sections and formulas are
numbered separately in each chapter, and, within the chapter,
references to formulas are given without the number of the chapter.

This  text  should  be  useful  to  students  majoring  in  engineering, 
physics  and  other  fields  as  well,  starting  roughly  in  the  latter  half 
of  the  first  year  of  study  as  a supplementary  mathematics  text.  It  is 
also  suitable  for  the  practicing  engineer  and  specialist  if  he  wishes  to 
look  into  some  aspect  of  mathematics  that  may  be  of  use  in  his  work. 

It  is  undoubtedly  difficult  to  read  such  diversified  material  in  an 
active  fashion.  The  reader  with  a specific  purpose  in  mind  will  ordinarily 
devote  his  whole  attention  to  the  matter  at  hand  and  skip  through 
any  other  material  that  does  not  pertain  to  his  problem. 

The  question  arises  of  whether  there  is  any  purpose  in  reading  right 
through  the  book  with  the  aim  of  storing  up  useful  knowledge  for  the 
future.  Would  not  such  an  approach  make  for  superficial  reading  that 
would  leave  only  hazy  impressions  in  the  end? 

The  authors  take  the  following  view. 

Firstly, we have always attempted to give not only practical
procedures but also the underlying general ideas. Such ideas enrich
the reader, expand his scientific horizon and enable him to take a
different look at the surrounding world, all of which produces a more
profound and lasting impression.

The  second  argument  in  favour  of  this  approach  has  to  do  with 
the  psychology  of  memory.  The  reader  may  not  be  able  to  reproduce 
all  the  material  he  has  gone  through,  but  traces  are  definitely  left  in 
his memory. Some of it will come to mind when brought into
association with other, familiar, things; for example, when he encounters
a problem that requires just such a procedure or mathematical device.
Even  a very  hazy  recollection  of  where  such  a device  may  be  found 
proves  to  be  useful.*  To  put  it  crudely,  there  are  things,  concepts  and 
relationships  which  we  recall  only  when  they  are  needed. 

So  our  advice  is:  read  our  book  and  study  it.  But  even  if  there 
is  not  time  enough  to  make  a deep  study,  then  merely  read  it  like 
a novel,  and  it  may  be  just  the  thing  you  will  need  for  solving  some 
difficult  problem. 


* Following Freud, we might say that forgotten theorems remain hidden in the
subconscious, alongside the repressions of early childhood.

Never fear difficulties. It is only in grappling with hard problems
that  a person  discovers  latent  talent  and  exhibits  the  best  he  has. 

We  will  be  grateful  to  our  readers  for  any  remarks  and  criticism 
they  have  with  respect  to  the  contents  of  this  text  and  the  exposition 
of  the  material.  Quite  naturally,  some  parts  of  the  book  reflect  the 
different  habits  and  interests  of  the  authors,  one  of  whom  is  a physicist 
and the other a mathematician. At times we pulled in different
directions, and if to this one adds the efforts of the reader, then the overall
effect should follow the law of addition of vector forces (see Ch. 9).
One  of  Krylov's  fables  dealt  with  a similar  situation,  but  we  hope  that 
the  results  in  our  case  will  be  better. 

The  main  purpose  of  this  book  is  to  provide  the  reader  with  methods 
and  information  of  practical  utility,  while  simplifying  in  the  extreme 
all  formal  definitions  and  proofs. 

Chapter  15  is  devoted  to  electronic  digital  computers  and  describes 
the  operating  principles  of  the  machines  and  the  programming  of 
elementary  problems.  This  chapter  is  not  designed  to  take  the  place 
of  specialized  texts  for  programmers,  but  it  does  enable  one  to  grasp 
the  general  principles  and  specific  nature  of  numerical  computations. 

Chapters 9 and 11 give detailed definitions of vectors and tensors,
their  relationships  with  linear  transformations,  and  examine  quite 
thoroughly  symmetric  and  antisymmetric  tensors.  Here  we  introduce 
the  pseudovector,  which  is  the  equivalent  of  the  antisymmetric  tensor 
in  three-dimensional  space.  In  this  latest  edition,  physical  problems 
have  been  added  on  the  motion  of  a particle  in  a central-force  field  and 
on  the  rotation  of  a rigid  body.  These  problems  have  always  been 
classical examples of the application of the theory of ordinary
differential equations to physics.

In  connection  with  the  theory  of  spectral  decomposition  we  have 
considered  the  problem  of  the  phase  of  Fourier  components;  the  loss 
of information in transitions to spectral density has become
customary; it is well to recall that besides the square of the modulus,
oscillations are also described by the phase.

These  and  other  additions,  it  is  hoped,  will  make  the  text  more 
physical  and  somewhat  more  modern  without  altering  its  overall 
purpose  and  accessibility. 


Chapter  1 

CERTAIN  NUMERICAL  METHODS 


The solution of a physical problem that has been obtained in
mathematical terms, such as combinations of various functions,
derivatives, integrals and the like, must be “brought to the number
stage”, which serves as the ultimate answer in most cases. For this
purpose, the various divisions of mathematics have worked out a
diversity of numerical methods. In elementary mathematics, as a rule,
one considers only methods of the exact solution of problems: the
solution of equations, geometrical constructions, and the like. This is
a weakness, since such solutions are possible only in very rare
instances, which naturally reduces drastically the range of admissible
problems, and they frequently turn out to be extremely unwieldy. Even
such a relatively simple problem as the solution of general algebraic
equations of the nth degree (with arbitrary coefficients) has proved, in
elementary mathematics when n > 2, to be unbelievably complicated and
cumbersome, whereas for n > 4 it is simply impossible. Only the
systematic application of the methods of approximate calculations based on
the apparatus of higher mathematics has made it possible to bring
to completion, and, what is more, in a unified manner, the solution
of a broad class of mathematical problems that find important
applications. Moreover, the development of numerical methods of higher
mathematics and the introduction of modern computing machines
have resulted in the following: if a problem has been stated in
sufficiently clear-cut mathematical terms, then (with the exception of
particularly intricate cases) it will definitely be solved with an accuracy
that satisfies practical needs. Thus, higher mathematics not only
supplies the ideas underlying an analysis of physical phenomena, but
also the numerical methods that permit carrying the solution of
specific problems of physics and engineering to completion.

Some  of  these  methods  are  given  in  the  first  course  of  differential 
and  integral  calculus.  For  instance,  the  most  elementary  methods  of 
computing  derivatives  and  integrals,  evaluating  functions  by  means 
of  series,  and  so  forth.  In  this  and  subsequent  chapters  (in  particular, 
Chs.  2,  3,  8)  we  examine  these  methods  in  detail.  They  are  not  directly 
connected  in  any  way  and  so  they  can  be  read  independently. 
1.1 Numerical integration

When a practical problem has been reduced to evaluating a definite
integral, we can say that the most difficult part of the matter is
over. If the integrand $f(x)$ is such that it is possible to express the
indefinite integral $F(x)$ by means of a finite number of elementary
functions, then in principle it is rather simple to obtain the value of
the definite integral via the formula

$$\int_a^b f(x)\,dx = F(b) - F(a)$$

This involves carrying out a number of arithmetical operations to find
the values of $F(b)$ and $F(a)$. In practice, however, this can lead to
considerable complications since a very cumbersome formula may
result for the indefinite integral $F(x)$.

This procedure may prove to be unsuitable altogether if, as
frequently happens, it is impossible to obtain a formula for the
indefinite integral.

It  sometimes  happens  that  the  integral  is  expressible  in  terms 
of  nonelementary  but  well-studied  functions  for  which  tables  of 
values  have  been  compiled  (see,  for  example,  Tables  of  Higher 
Functions  by  Jahnke,  Emde,  and  Losch  [9]).  Such,  for  instance,  are 
the  functions 

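$$\int e^{-x^2}\,dx,\quad \int \frac{\sin x}{x}\,dx,\quad \int \frac{\cos x}{x}\,dx,\quad \int \frac{e^x}{x}\,dx,\quad \int \sin x^2\,dx,\quad \int \cos x^2\,dx$$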


and  so  forth.  These  integrals,  which  cannot  be  expressed  in  terms  of 
elementary  functions,  even  have  special  names:  error  function  (see 
Ch.  13),  the  sine  integral,  etc. 

In  other  cases  the  integral  can  be  evaluated  by  expanding  the 
integrand  in  a series  of  one  kind  or  another.  For  example,  for  the 
integral 

$$I = \int_0^1 \frac{\sin x}{x}\,dx$$


the  appropriate  indefinite  integral  cannot  be  expressed  in  terms  of 
elementary  functions.  Nevertheless,  by  taking  advantage  of  the  Taylor 
series  for  the  sine, 


$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots$$


we  obtain 


$$I = \int_0^1 \frac{\sin x}{x}\,dx = 1 - \frac{1}{3 \cdot 3!} + \frac{1}{5 \cdot 5!} - \frac{1}{7 \cdot 7!} + \dots$$


Fig. 1

Computing the successive terms of this series, we stop when the desired
reasonable degree of accuracy has been attained. For example, computing
to three decimal places we can confine ourselves to the first three
terms of the series and obtain the value $I = 0.946$.
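
The term-by-term summation just described can be sketched in a few lines of Python. This is only an illustration under our own naming (the function `sine_integral_series` does not come from the text); it sums the series $x - \frac{x^3}{3 \cdot 3!} + \frac{x^5}{5 \cdot 5!} - \dots$ and is evaluated at $x = 1$.

```python
from math import factorial


def sine_integral_series(x, terms):
    """Partial sum of x - x^3/(3*3!) + x^5/(5*5!) - ... using `terms` terms."""
    total = 0.0
    for k in range(terms):
        n = 2 * k + 1  # odd exponents 1, 3, 5, ...
        total += (-1) ** k * x ** n / (n * factorial(n))
    return total


# Three terms at x = 1 reproduce the value quoted in the text.
three_terms = sine_integral_series(1.0, 3)
print(round(three_terms, 3))  # 0.946
```

Because the series alternates with rapidly decreasing terms, the first omitted term, $\frac{1}{7 \cdot 7!}$, gives a fair idea of the remaining error.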

Some  definite  integrals  can  be  evaluated  precisely  with  the  aid 
of methods of the theory of functions of a complex variable. For
instance, in Sec. 5.9 we will show that

$$\int_{-\infty}^{\infty} \frac{\cos \omega x}{1 + x^2}\,dx = \pi e^{-\omega} \quad (\omega > 0) \tag{1}$$

although  the  corresponding  indefinite  integral  cannot  be  expressed 
in  terms  of  elementary  functions. 

And  if  even  such  methods  prove  unsuitable,  then  the  definite 
integral  can  be  evaluated  numerically  with  the  aid  of  formulas  of 
numerical  integration,  which  we  now  discuss. 

A simple  and  good  method  (discussed  in  HM,  Sec.  2.7)  consists 
in  the  following : the  interval  of  integration  is  partitioned  into  several 
small  equal  subintervals.  The  integral  over  each  subinterval  is  roughly 
equal  to  the  product  of  the  length  of  the  subinterval  by  the  arithmetic 
mean  of  the  values  of  the  integrand  at  the  endpoints  of  the  subinterval. 
This  method  is  called  the  trapezoidal  rule  because  it  actually  amounts 
to replacing the arc in each subinterval of the curve $y = f(x)$ by its
chord,  and  the  area  under  the  arc  (the  value  of  the  integral)  is  replaced 
by  the  area  of  the  resulting  trapezoid  with  vertical  bases  (Fig.  1). 
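The appropriate formula is (but verify this)

$$\int_a^b y\,dx \approx \frac{b - a}{n}\left(\frac{y_0 + y_n}{2} + y_1 + y_2 + \dots + y_{n-1}\right) \tag{2}$$

where $n$ is the number of subintervals and $f(x_i) = y_i$ for the sake of brevity.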
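
As a hedged illustration of formula (2), the trapezoidal rule might be coded as follows. The function name `trapezoid` and the test integrands are our own choices, and the sketch assumes the integrand can be evaluated at every partition point.

```python
def trapezoid(f, a, b, n):
    """Trapezoidal rule (2) on [a, b] with n equal subintervals."""
    h = (b - a) / n                              # length of one subinterval
    ys = [f(a + i * h) for i in range(n + 1)]    # y_0, y_1, ..., y_n
    return h * ((ys[0] + ys[-1]) / 2 + sum(ys[1:-1]))


# A linear integrand is reproduced exactly, even with one subinterval.
linear = trapezoid(lambda x: 2 * x + 1, 0.0, 1.0, 1)
# A parabola is not: the chord lies above the arc of y = x^2.
parabola = trapezoid(lambda x: x * x, 0.0, 1.0, 2)
print(linear, parabola)  # 2.0 0.375
```

The second value exceeds the exact integral 1/3, which is what replacing a convex arc by its chord should produce.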
A still more effective formula can be obtained if the curve $y = f(x)$
over a subinterval is replaced by a parabola, that is to say, the graph
of a quadratic relationship. Partition the interval of integration from
$x = a$ to $x = b$ into an even number, $2m$, of equal subintervals.
Referring to Fig. 2, let the endpoints of the subintervals be $x_0 = a$,
$x_1, x_2, \dots, x_{2m} = b$. Denote the length of a subinterval by $h$ so that
$x_1 = x_0 + h$, $x_2 = x_1 + h = x_0 + 2h$, ..., $x_{2m} = x_{2m-1} + h = x_0 + 2mh$.


Fig.  2 

Consider

$$\int_{x_0}^{x_2} f(x)\,dx = \int_{x_0}^{x_0 + 2h} f(x)\,dx$$

or the contribution to the original integral of the first two subintervals.
Replace the curve $y = f(x)$ over the interval from $x = x_0$ to $x = x_2$
by a parabola passing through the points $(x_0, y_0)$, $(x_1, y_1)$, $(x_2, y_2)$,
and approximately replace the area under the curve by the area under
the parabola. The equation of the parabola has the form
$y = mx^2 + nx + l$. The coefficients $m$, $n$, $l$ are found from the condition that
the parabola passes through the three given points:

$$\begin{cases} y_0 = m x_0^2 + n x_0 + l, \\ y_1 = m x_1^2 + n x_1 + l, \\ y_2 = m x_2^2 + n x_2 + l \end{cases}$$

But this leads to rather cumbersome computations.

It  is  simpler  to  do  as  follows.  Seek  the  equation  of  the  parabola 
in  the  form 

$$y = A(x - x_0)(x - x_1) + B(x - x_0)(x - x_2) + C(x - x_1)(x - x_2) \tag{3}$$

It is clear that what we have on the right of (3) is a second-degree
polynomial so that (3) is indeed the equation of a parabola.
Put $x = x_0$ in (3) to get $y_0 = C(x_0 - x_1)(x_0 - x_2)$, or $y_0 = 2Ch^2$.
Putting $x = x_1 = x_0 + h$ in (3), we get $y_1 = -Bh^2$. Finally, putting
$x = x_2 = x_0 + 2h$ in (3), we get $y_2 = 2Ah^2$, whence*

$$A = \frac{y_2}{2h^2}, \quad B = -\frac{y_1}{h^2}, \quad C = \frac{y_0}{2h^2} \tag{4}$$

* Note that the following problem can be solved in similar fashion: to find a
polynomial of degree n that takes on specified values for n + 1 values of x.
This interpolation problem is encountered in searching for empirical formulas (see
Sec. 2.4), in operations involving functions given in tabular form, and elsewhere.

The area under the parabola is

$$\int_{x_0}^{x_0 + 2h} \left[A(x - x_0)(x - x_1) + B(x - x_0)(x - x_2) + C(x - x_1)(x - x_2)\right] dx = A\,\frac{2}{3}h^3 - B\,\frac{4}{3}h^3 + C\,\frac{2}{3}h^3$$

(When performing the integration it is convenient to make a change
of variable: $x - x_1 = s$; also note that $x_0 = x_1 - h$, $x_2 = x_1 + h$.)
Therefore

$$\int_{x_0}^{x_0 + 2h} f(x)\,dx \approx A\,\frac{2}{3}h^3 - B\,\frac{4}{3}h^3 + C\,\frac{2}{3}h^3$$

Using (4) we find

$$\int_{x_0}^{x_0 + 2h} f(x)\,dx \approx \frac{h}{3}(y_0 + 4y_1 + y_2) \tag{5}$$


In the same way we can evaluate $\int_{x_2}^{x_4} f(x)\,dx$, ..., $\int_{x_{2m-2}}^{x_{2m}} f(x)\,dx$. For the
integral over the whole interval from $x = a$ to $x = b$ we get

$$\int_a^b f(x)\,dx \approx \frac{h}{3}\left[(y_0 + 4y_1 + y_2) + (y_2 + 4y_3 + y_4) + \dots + (y_{2m-2} + 4y_{2m-1} + y_{2m})\right]$$

whence

$$\int_a^b f(x)\,dx \approx \frac{h}{3}\left[y_0 + y_{2m} + 2(y_2 + y_4 + \dots + y_{2m-2}) + 4(y_1 + y_3 + \dots + y_{2m-1})\right] \tag{6}$$

where $h = \frac{b - a}{2m}$. This formula is known as Simpson's formula
(Simpson's rule).
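
Formula (6) translates almost literally into code. The following is a minimal sketch; the function name `simpson` and the convention of passing the number of pairs `m` (so that there are `2m` subintervals) are our own assumptions.

```python
def simpson(f, a, b, m):
    """Simpson's rule (6) on [a, b] with 2m equal subintervals."""
    if m < 1:
        raise ValueError("m must be at least 1")
    h = (b - a) / (2 * m)
    ys = [f(a + i * h) for i in range(2 * m + 1)]   # y_0, ..., y_{2m}
    even = sum(ys[2:-1:2])   # y_2 + y_4 + ... + y_{2m-2}
    odd = sum(ys[1::2])      # y_1 + y_3 + ... + y_{2m-1}
    return h / 3 * (ys[0] + ys[-1] + 2 * even + 4 * odd)


# With a single pair of subintervals the rule already integrates x^3 exactly.
cubic = simpson(lambda x: x ** 3, 0.0, 2.0, 1)
print(cubic)  # 4.0
```

Passing `m` rather than the total number of subintervals makes it impossible to request an odd partition, for which (6) is not defined.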


If we rewrite (5) as

$$\frac{1}{2h}\int_{x_0}^{x_0 + 2h} f(x)\,dx \approx \frac{2}{3}y_1 + \frac{y_0 + y_2}{6}$$

then we find that Simpson's formula is based on the approximate
replacement of the mean value $\bar{y}$ of the function $y(x)$ on the interval
from $x_0$ to $x_0 + 2h$ by

$$\bar{y} \approx \frac{2}{3}y_1 + \frac{y_0 + y_2}{6} \tag{7}$$

The trapezoidal formula would have yielded

$$\bar{y} \approx \frac{1}{2}y_1 + \frac{y_0 + y_2}{4} \tag{8}$$

(check this). To verify the type of formula, put $y(x) \equiv k = \text{constant}$,
whence $\bar{y} = k$. Then both formulas become exact. From the
derivation of the formulas it follows that both (7) and (8) are exact for
linear functions (of the type $y = px + n$), and formula (7) is exact
also for quadratic functions (of the type $y = px^2 + nx + l$). It is
interesting to note that (7) is actually also exact for cubic functions
($y = px^3 + nx^2 + lx + k$).
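
These exactness claims can be checked mechanically. The sketch below is our own construction, not the text's: it compares the right-hand sides of (7) and (8) with the exact mean value of $x^p$ on a test interval, using rational arithmetic so that "exact" really means exact. By linearity, exactness on the powers $1, x, x^2, x^3$ implies exactness for every cubic.

```python
from fractions import Fraction


def mean_simpson(y, x0, h):
    """Right-hand side of formula (7)."""
    return Fraction(2, 3) * y(x0 + h) + (y(x0) + y(x0 + 2 * h)) / 6


def mean_trapezoid(y, x0, h):
    """Right-hand side of formula (8)."""
    return y(x0 + h) / 2 + (y(x0) + y(x0 + 2 * h)) / 4


def exact_mean_of_power(p, x0, h):
    """Exact mean value of x**p over the interval [x0, x0 + 2h]."""
    x2 = x0 + 2 * h
    return (x2 ** (p + 1) - x0 ** (p + 1)) / ((p + 1) * 2 * h)


x0, h = Fraction(1), Fraction(1, 2)   # arbitrary test interval [1, 2]
simpson_exact = []
trapezoid_exact = []
for p in range(5):
    power = lambda x, p=p: x ** p
    target = exact_mean_of_power(p, x0, h)
    simpson_exact.append(mean_simpson(power, x0, h) == target)
    trapezoid_exact.append(mean_trapezoid(power, x0, h) == target)

print(simpson_exact)    # [True, True, True, True, False]
print(trapezoid_exact)  # [True, True, False, False, False]
```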

If  the  interval  is  partitioned  into  the  same  number  of  subintervals, 
Simpson's  formula  is  as  a rule  much  more  exact  than  the  trapezoidal 
rule. 

Let  us  consider  some  examples. 

1. Find $I = \int_0^1 \frac{dx}{1 + x}$.

Partition the interval of integration into two subintervals and
evaluate the integral using the trapezoidal rule and then Simpson's
rule.

We find the necessary values of the integrand:

x    0    0.5      1
y    1    0.6667   0.5

The trapezoidal rule yields

$$I \approx \frac{1}{2}\left(\frac{1 + 0.5}{2} + 0.6667\right) = 0.7083$$

Simpson's rule yields

$$I \approx \frac{1}{6}(1 + 4 \cdot 0.6667 + 0.5) = 0.6944$$

Since $I = \ln(1 + x)\big|_0^1 = \ln 2 = 0.6931$ (to four decimal places), it
follows that Simpson's rule yielded an error of roughly 0.2% and the
trapezoidal rule 2%.

2. Find $I = \int_0^1 \frac{\ln(1 + x)}{1 + x^2}\,dx$. (Note that $\int \frac{\ln(1 + x)}{1 + x^2}\,dx$ cannot be
expressed in terms of elementary functions.) Partition the interval
as in Example 1, then find the required values of the integrand:

x    0    0.5     1
y    0    0.324   0.346

By the trapezoidal rule we get $I \approx \frac{1}{2}\left(\frac{0.346}{2} + 0.324\right) = 0.248$.

By Simpson's rule we have $I \approx \frac{1}{6}(0.324 \cdot 4 + 0.346) = 0.274$.

The exact value of the integral (to three significant places) is 0.272.
Therefore, the trapezoidal rule yielded an error of about 10%, while
Simpson's rule was less than 1% in error.
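
Both examples can be reproduced with a short script. This is a sketch under our own naming; the closed form $\frac{\pi}{8}\ln 2$ used for the second integral is a known result that we bring in as an outside assumption, since the text quotes only the rounded value 0.272.

```python
from math import log, pi


def two_subinterval_rules(f, a, b):
    """Trapezoidal and Simpson estimates with the interval cut in two."""
    h = (b - a) / 2
    y0, y1, y2 = f(a), f(a + h), f(b)
    trap = h * ((y0 + y2) / 2 + y1)
    simp = h / 3 * (y0 + 4 * y1 + y2)
    return trap, simp


# Example 1: integrand 1/(1 + x), exact value ln 2.
trap1, simp1 = two_subinterval_rules(lambda x: 1 / (1 + x), 0.0, 1.0)
exact1 = log(2)

# Example 2: integrand ln(1 + x)/(1 + x^2); closed form assumed, see above.
trap2, simp2 = two_subinterval_rules(lambda x: log(1 + x) / (1 + x * x), 0.0, 1.0)
exact2 = pi / 8 * log(2)

for name, value, exact in [("trapezoid 1", trap1, exact1), ("Simpson 1", simp1, exact1),
                           ("trapezoid 2", trap2, exact2), ("Simpson 2", simp2, exact2)]:
    print(f"{name}: {value:.4f}, relative error {abs(value - exact) / exact:.1%}")
```

The trapezoidal figure for Example 2 comes out slightly larger here than in the text, because the text rounds the integrand values to three places before combining them.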

The advantage of Simpson's rule is particularly evident as the
number $n$ of subintervals in the partition increases. In that case we
can say that the error of the trapezoidal rule decreases in inverse
proportion to $n^2$, while the error of Simpson's rule decreases in inverse
proportion to $n^4$ (if the third derivative of the integrand remains
finite on the interval of integration).
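
The stated error laws can be observed numerically. In the sketch below (our own construction, with $e^x$ on $[0, 1]$ chosen arbitrarily as a smooth test integrand), doubling $n$ should divide the error by about $2^2$ for the trapezoidal rule and about $2^4$ for Simpson's rule, so the base-2 logarithm of the error ratio estimates the exponent.

```python
from math import exp, log


def trapezoid(f, a, b, n):
    h = (b - a) / n
    return h * ((f(a) + f(b)) / 2 + sum(f(a + i * h) for i in range(1, n)))


def simpson(f, a, b, n):
    """Simpson's rule with n subintervals; n must be even."""
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))
    even = sum(f(a + i * h) for i in range(2, n, 2))
    return h / 3 * (f(a) + f(b) + 4 * odd + 2 * even)


def observed_order(rule, f, a, b, exact, n):
    """Estimate k in 'error proportional to 1/n^k' by doubling n."""
    coarse = abs(rule(f, a, b, n) - exact)
    fine = abs(rule(f, a, b, 2 * n) - exact)
    return log(coarse / fine) / log(2)


exact = exp(1) - 1
order_trapezoid = observed_order(trapezoid, exp, 0.0, 1.0, exact, 8)
order_simpson = observed_order(simpson, exp, 0.0, 1.0, exact, 8)
print(round(order_trapezoid, 2), round(order_simpson, 2))
```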

The  foregoing  examples  show  how  quickly  and  simply  numerical 
methods  permit  finding  integrals.  It  often  happens  that  even  when 
we  can  find  an  exact  formula  for  the  integral,  it  is  easier  to  evaluate 
it  by  numerical  integration. 

The  great  Italian  physicist  Enrico  Fermi,  builder  of  the  first 
atomic  pile,  preferred  (as  his  work  notes  show)  to  evaluate  an  integral 
numerically  even  if  it  could  be  expressed  in  terms  of  logarithms  and 
arc  tangents. 

Exercises 

1.  Evaluate  the  following  integrals  using  Simpson's  rule  and  the 
trapezoidal  rule,  the  interval  of  integration  being  partitioned  into 
four  subintervals: 


(a) $\int_1^3 \frac{dx}{x}$, (b) $\int_0^2 e^{-x^2}\,dx$.

Carry out the computations to within slide-rule accuracy. The
values of the exponential function may be taken from any
mathematics handbook, say A Guide-Book to Mathematics by Bronstein
and Semendyayew [3].

2. Find $\int_0^3 \ldots\,dx$ by Simpson's rule with the interval of integration
partitioned into 4 subintervals. Carry the calculations to four
decimal places. As a check, evaluate the same integral with a
partition into 6 subintervals.

3. Verify directly that formula (8) is exact for first-degree
polynomials, and formula (7) is exact for second-degree polynomials.
Hint. First verify the exactness of these formulas for the function
$y = x$, and of (7) for the function $y = x^2$.

1.2  Computing  sums  by  means  of  integrals 

Consider the sums of several numbers obtained by a definite law,
say,

$$S_6 = 1 + 2 + 3 + 4 + 5 + 6$$

or

$$S_{96} = \sqrt{5} + \sqrt{6} + \sqrt{7} + \dots + \sqrt{99} + \sqrt{100}$$

or

$$S_{21} = 1 + \frac{1}{1.1} + \frac{1}{1.2} + \frac{1}{1.3} + \dots + \frac{1}{2.9} + \frac{1}{3.0}$$


Denote  the  sum  by  S.  The  subscript  indicates  the  number  of  terms 
in  the  sum.  Each  such  sum  may  be  written  more  compactly  if  we  are 
able  to  establish  a relationship  between  the  magnitude  of  a separate 
term  in  the  sum  and  the  number  of  that  term.  In  the  first  case,  the 
term $a_n$ is simply equal to the number $n$, which varies from 1 to 6,
and so the first sum is

$$S_6 = \sum_{n=1}^{6} a_n = \sum_{n=1}^{6} n$$

It is sometimes more convenient to start with zero; then we have
to put $n - 1 = m$, that is, $n = m + 1$, which yields

$$S_6 = \sum_{m=0}^{5} (m + 1)$$

Instead of $m$ we could write $n$ or any other letter since the summation
index is a dummy index, that is to say, one in which the sum does
not depend on the letter used to indicate it. (In this respect, the
summation index is similar to the variable of integration in a definite integral;
see HM, Sec. 2.8.) Note that in the last sum the index varies from 0
to 5 and the number of terms is accordingly equal to six.

We can write the second sum in similar fashion, thus:

$$S_{96} = \sum_{p=5}^{100} \sqrt{p}$$

However it is better if the index varies from 1 or 0; to do this, put
$p - 4 = m$ or $p - 5 = n$ and we get

$$S_{96} = \sum_{m=1}^{96} \sqrt{m + 4} = \sum_{n=0}^{95} \sqrt{n + 5}$$

The third sum is of the form

$$S_{21} = \sum_{n=0}^{20} \frac{1}{1 + 0.1n}$$

We see that it may be regarded as the sum of values of the function
$f(x) = \frac{1}{x}$ when $x$ assumes values from $x = 1$ to $x = 3$ at intervals
of 0.1 (this approach will be found to be useful in the sequel). In other
cases as well, one often finds it convenient to view a given sum as the
sum of values of a certain function $f(x)$ for $x$ varying from $x = a$
to $x = b$ with a constant interval $\Delta x = h$.
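
The index shifts described above are easy to confirm by direct summation. The following sketch uses variable names of our own choosing; the half-open `range` of Python means that an upper summation limit $N$ appears as `N + 1`.

```python
from math import sqrt

# First sum: index from 1 to 6, or shifted to run from 0 to 5.
S6_from_one = sum(n for n in range(1, 7))
S6_from_zero = sum(m + 1 for m in range(0, 6))

# Second sum: p = 5..100, or m = p - 4 = 1..96, or n = p - 5 = 0..95.
S96_p = sum(sqrt(p) for p in range(5, 101))
S96_m = sum(sqrt(m + 4) for m in range(1, 97))
S96_n = sum(sqrt(n + 5) for n in range(0, 96))

# Third sum: values of f(x) = 1/x for x = 1, 1.1, ..., 3.0.
S21 = sum(1 / (1 + 0.1 * n) for n in range(0, 21))

print(S6_from_one, S6_from_zero)                 # 21 21
print(round(S96_p, 1), round(S21, 2))
```

Whatever letter is used for the index, the number of terms is the upper limit minus the lower limit plus one.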

The  foregoing  sums  are  not  hard  to  compute,  but  if  a sum 
consists  of  a large  number  of  terms,  computation  can  be  a wearisome 
job.  What  is  more,  one  often  has  to  compare  sums  based  on  the  same 
law  but  with  different  numbers  of  terms.  In  this  case  it  is  desirable  to 
know  not  only  the  sum  for  a certain  definite  number  of  terms  but  also 
how  the  sum  depends  on  the  number  of  terms  involved. 

In  many  cases  it  is  possible  to  approximate  a sum  with  the  aid  of 
integrals.  Recall  that  the  study  of  definite  integrals  begins  with  an 
approximate  representation  of  the  integral  as  the  sum  of  a finite 
number  of  terms.  It  is  then  established  that  exact  integration  of  many 
functions  by  means  of  indefinite  integrals  is  quite  an  easy  affair. 
Let  us  take  advantage  of  this  for  computing  sums. 

Write down the trapezoidal formula that gives an approximate
expression of an integral in the form of a sum. If the interval of
integration between $x = a$ and $x = b$ is partitioned into $n$ equal
subintervals of length $\frac{b - a}{n} = h$ each, then

$$\int_a^b f(x)\,dx \approx h\left[\frac{1}{2}f(a) + f(a + h) + f(a + 2h) + \dots + f(a + (n - 1)h) + \frac{1}{2}f(a + nh)\right]$$

where $a + nh = b$.
From this formula we obtain the following expression for the sum:

$$f(a) + f(a + h) + f(a + 2h) + \dots + f(b) \approx \frac{1}{h}\int_a^b f(x)\,dx + \frac{1}{2}f(a) + \frac{1}{2}f(b) \tag{9}$$

Dropping the term $\frac{1}{2}f(a) + \frac{1}{2}f(b)$ in the right member of (9),
which is justified if

$$\frac{1}{h}\int_a^b f(x)\,dx \gg \frac{1}{2}f(a) + \frac{1}{2}f(b)$$

that is, if the number of terms of the sum is great and the function
$f(x)$ does not vary too fast, we obtain a cruder formula:

$$f(a) + f(a + h) + f(a + 2h) + \dots + f(b) \approx \frac{1}{h}\int_a^b f(x)\,dx \tag{10}$$

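If $h$ is small, we can take advantage of the approximate formulas

$$\frac{1}{2}f(a) \approx \frac{1}{h}\int_{a - h/2}^{a} f(x)\,dx, \qquad \frac{1}{2}f(b) \approx \frac{1}{h}\int_{b}^{b + h/2} f(x)\,dx$$

Then from (9) we get

$$f(a) + f(a + h) + f(a + 2h) + \dots + f(b) \approx \frac{1}{h}\int_a^b f(x)\,dx + \frac{1}{h}\int_{a - h/2}^{a} f(x)\,dx + \frac{1}{h}\int_{b}^{b + h/2} f(x)\,dx$$

or

$$f(a) + f(a + h) + f(a + 2h) + \dots + f(b) \approx \frac{1}{h}\int_{a - h/2}^{b + h/2} f(x)\,dx \tag{11}$$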


In  deriving  (10)  and  (11),  we  proceeded  from  formula  (9),  which  was 
obtained  from  the  trapezoidal  formula,  which  itself  is  approximate. 
For  this  reason,  although  (9)  and  (11)  are  more  exact  than  (10),  they 
are still approximate formulas. The accuracy of the formulas increases
as  we  diminish  h and  increase  the  number  of  terms  in  the  sum. 

Let  us  apply  the  formulas  just  obtained  to  computing  the  sums 
given  at  the  beginning  of  this  section. 

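For the sum $S_6 = 21$, $f(x) = x$, $h = 1$. By (10) we get

$$S_6 \approx \int_1^6 x\,dx = \frac{6^2}{2} - \frac{1^2}{2} = 17.5$$

The error is 17%. By formula (9) we get

$$S_6 \approx \int_1^6 x\,dx + \frac{1}{2} + \frac{6}{2} = 21$$

And finally by (11) we have

$$S_6 \approx \int_{0.5}^{6.5} x\,dx = \frac{(6.5)^2}{2} - \frac{(0.5)^2}{2} = 21$$

Formulas (9) and (11) yielded the exact result. This is because in the
case at hand the function $f(x)$ is linear (see Exercise 3).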

For the next sum, $S_{96} = \sqrt{5} + \sqrt{6} + \dots + \sqrt{100}$, taking into
account that $f(x) = \sqrt{x}$, $h = 1$, we get by (10)

$$S_{96} \approx \int_5^{100} \sqrt{x}\,dx = \frac{2}{3}x^{3/2}\Big|_5^{100} = \frac{2}{3}(1000 - 5\sqrt{5}) = 659.2$$

By formula (9) we get $S_{96} \approx 665.3$ and by (11) $S_{96} \approx 665.3$. The
sum $S_{96}$ computed to an accuracy of four significant figures is
also 665.3.

For the last sum $f(x) = \frac{1}{x}$, $h = 0.1$. And so by (10) we have

$$S_{21} \approx \frac{1}{0.1}\int_1^3 \frac{dx}{x} = 10 \ln 3 = 10.99$$

By (9) we get $S_{21} \approx 11.65$ and by (11) obtain $S_{21} \approx 11.66$. Computed
directly, to two decimal places, the sum $S_{21} = 11.66$.
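
The three worked examples can be recomputed with a few helper functions. This is a sketch, not the text's own procedure: the function names are ours, and each helper assumes that an antiderivative `F` of `f` is known in closed form, as it is for all three sums.

```python
from math import log, sqrt


def sum_by_10(F, a, b, h):
    """Crude formula (10): (1/h) * integral of f from a to b."""
    return (F(b) - F(a)) / h


def sum_by_9(f, F, a, b, h):
    """Formula (9): formula (10) plus half of the first and last terms."""
    return (F(b) - F(a)) / h + f(a) / 2 + f(b) / 2


def sum_by_11(F, a, b, h):
    """Formula (11): integration limits widened by h/2 on each side."""
    return (F(b + h / 2) - F(a - h / 2)) / h


cases = {
    "S6": (lambda x: x, lambda x: x * x / 2, 1, 6, 1),
    "S96": (sqrt, lambda x: 2 / 3 * x ** 1.5, 5, 100, 1),
    "S21": (lambda x: 1 / x, log, 1, 3, 0.1),
}
results = {}
for name, (f, F, a, b, h) in cases.items():
    results[name] = (sum_by_10(F, a, b, h), sum_by_9(f, F, a, b, h), sum_by_11(F, a, b, h))
    print(name, [round(v, 2) for v in results[name]])
```

For each sum the values are listed in the order (10), (9), (11), which makes the systematic shortfall of the crude formula (10) easy to see.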

In the foregoing examples, the formulas (9) and (11) give very
satisfactory accuracy and so they should be used in practical situations.
The crude formula (10) may be useful to formulate the law of an
increasing sum when the number of terms increases without bound. However,
in all numerical computations with a definite number of terms in the
sum, one should use (11), which is not any more complicated than
(10) but is much more accurate*.

In  all  the  cases  considered  above,  the  separate  terms  of  the  sum 
had  the  same  signs. 

If  the  terms  have  different  signs,  then  to  compute  the  sum  via 
an  integral,  one  should  consider  the  integral  of  a function  that  changes 
sign  several  times  over  the  interval  of  integration.  For  such  integrals 
the  trapezoidal  rule  yields  very  poor  results.  Therefore,  the  formulas 
we  have  derived  serve  only  for  sums  whose  terms  are  of  the  same  sign 
and  cannot  be  applied  to  sums  in  which  the  terms  have  different  signs. 
For  example,  these  formulas  cannot  be  used  to  find  sums  in  which 
the  signs  of  the  summands  alternate.  Sums  of  this  kind  are  called 
alternating.  An  example  of  an  alternating  sum  is 

$$S_8 = \frac{1}{4} - \frac{1}{5} + \frac{1}{6} - \frac{1}{7} + \frac{1}{8} - \frac{1}{9} + \frac{1}{10} - \frac{1}{11}$$


How  do  we  find  the  sum  here  ? 

If the terms of an alternating sum decrease (or increase) in
absolute value, so that the absolute value of each successive term is less
than (or more than) the absolute value of the preceding term, we can
use the following device.

Combine  adjacent  positive  and  negative  terms  in  pairs.  Then  the 
problem  reduces  to  computing  a sum  in  which  the  signs  of  all  terms  are 
the  same. 


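For example,

$$S_8 = \left(\frac{1}{4} - \frac{1}{5}\right) + \left(\frac{1}{6} - \frac{1}{7}\right) + \left(\frac{1}{8} - \frac{1}{9}\right) + \left(\frac{1}{10} - \frac{1}{11}\right) = \frac{1}{20} + \frac{1}{42} + \frac{1}{72} + \frac{1}{110}$$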


We  will  assume  that  the  original  alternating  sum  consists  of  an 
even  number  of  terms.  Then  if  it  begins  with  a positive  term,  the  final 
term  is  negative.  Such  a sum  may  be  written  thus: 


$$S_{n+1} = f(a) - f(a + h) + f(a + 2h) - f(a + 3h) + \dots - f(b) \tag{12}$$


where  b = a + nh. 

The difference between two adjacent terms of this sum can be
written as

$$f(a + kh) - f(a + (k + 1)h) \approx -h\,\frac{df}{dx}\bigg|_{x = a + \left(k + \frac{1}{2}\right)h}$$


* The use of (10) involves an obvious systematic error due to the dropping of two
terms in the derivation of the formula. If all the terms are the same, that is,
$f(x) \equiv A = \text{constant}$ and $(b - a)/h = n$, then the sum at hand is equal to
$(n + 1)A$, whereas (10) only yields $nA$ (in this case, (11) yields an exact value).

This is an approximate equation, but it is the more accurate, the
smaller $h$ is.* The sum (12) assumes the form

$$S_{n+1} \approx -h\left[f'\left(a + \frac{h}{2}\right) + f'\left(a + \frac{5}{2}h\right) + \dots + f'\left(b - \frac{h}{2}\right)\right] \tag{13}$$


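Let us apply (11) to the right-hand member. Observe that in
(11) $h$ is the difference between adjacent values of the
independent variable. In formula (13) this difference is equal to
$\left(a + \frac{5}{2}h\right) - \left(a + \frac{1}{2}h\right) = 2h$. For this reason, when using
(11) we have to take $2h$ instead of $h$; we then get

$$S_{n+1} \approx -h \cdot \frac{1}{2h}\int_{a - h/2}^{b + h/2} f'(x)\,dx = \frac{1}{2}\left[f\left(a - \frac{h}{2}\right) - f\left(b + \frac{h}{2}\right)\right] \tag{14}$$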

Let  us  use  this  formula  to  compute  the  sum 


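$$S_8 = \frac{1}{4} - \frac{1}{5} + \frac{1}{6} - \frac{1}{7} + \frac{1}{8} - \frac{1}{9} + \frac{1}{10} - \frac{1}{11}$$

Here

$$f(x) = \frac{1}{x}, \quad a = 4, \quad b = 11, \quad h = 1$$

and so we have

$$S_8 \approx \frac{1}{2}\left[\frac{1}{3.5} - \frac{1}{11.5}\right] = 0.0994$$

Direct summing gives $S_8 = 0.0968$ (with an error of less than 3%).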
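The comparison just made can be reproduced numerically. The following is a minimal Python sketch, not part of the original text; the helper names `alternating_sum` and `estimate_even` are illustrative assumptions. It evaluates the eight-term sum directly and by formula (14).

```python
def alternating_sum(f, a, h, n_terms):
    """Direct evaluation of f(a) - f(a+h) + f(a+2h) - ... (n_terms terms)."""
    return sum((-1) ** k * f(a + k * h) for k in range(n_terms))


def estimate_even(f, a, b, h):
    """Formula (14); meant for sums with an even number of terms."""
    return 0.5 * (f(a - h / 2) - f(b + h / 2))


def f(x):
    return 1.0 / x


direct = alternating_sum(f, a=4, h=1, n_terms=8)
approx = estimate_even(f, a=4, b=11, h=1)
relative_error = abs(approx - direct) / direct

print(f"direct = {direct:.4f}, formula (14) = {approx:.4f}")
print(f"relative error = {100 * relative_error:.1f}%")
```

Only two function evaluations are needed for the estimate, however many terms the sum contains.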


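* If we put $a + kh = x$ and change signs, the last equation can be
rewritten as $f(x+h) - f(x) \approx h f'\left(x + \frac{h}{2}\right)$.
It is easy to estimate its accuracy. Indeed, by Taylor's formula,

$$f(x+h) - f(x) = f'(x)h + \frac{f''(x)}{2}h^2 + \frac{f'''(x)}{6}h^3 + \dots$$

$$h f'\left(x + \frac{h}{2}\right) = f'(x)h + \frac{f''(x)}{2}h^2 + \frac{f'''(x)}{8}h^3 + \dots$$

Thus, the expansions differ only beginning with the third-order terms
when compared with $h$.

If an alternating sum has an odd number of terms, then the
first term $f(a)$ and the last term $f(b)$ are of the same sign. In
this case, we first use (14) to find the sum without the last term
$f(b)$, and then add that term:

$$S_{n+1} = f(a) - f(a+h) + f(a+2h) - \dots + f(b) \approx \frac{1}{2}\left[f\left(a - \frac{h}{2}\right) - f\left(b - \frac{h}{2}\right)\right] + f(b)$$

If $h$ is small, then to compute $f\left(b \pm \frac{h}{2}\right)$ we
can confine ourselves to the first two terms of Taylor's series:

$$f(x) = f(b) + f'(b)(x - b) + \dots$$

Putting $x = b - \frac{h}{2}$ here, we find
$f\left(b - \frac{h}{2}\right) \approx f(b) - \frac{h}{2}f'(b)$, and
putting $x = b + \frac{h}{2}$, we get
$f\left(b + \frac{h}{2}\right) \approx f(b) + \frac{h}{2}f'(b)$. And so

$$f(b) - \frac{1}{2}f\left(b - \frac{h}{2}\right) \approx \frac{1}{2}f(b) + \frac{1}{4}hf'(b) = \frac{1}{2}\left[f(b) + \frac{h}{2}f'(b)\right] \approx \frac{1}{2}f\left(b + \frac{h}{2}\right)$$

The final sum for the case of an odd number of terms is

$$S_{n+1} \approx \frac{1}{2}\left[f\left(a - \frac{h}{2}\right) + f\left(b + \frac{h}{2}\right)\right] \tag{15}$$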


This  formula  is  no  less  exact  than  (14),  with  the  aid  of  which  it  was 
obtained.  Consider  the  following  example: 

$$S_9 = \frac{1}{4} - \frac{1}{5} + \frac{1}{6} - \frac{1}{7} + \frac{1}{8} - \frac{1}{9} + \frac{1}{10} - \frac{1}{11} + \frac{1}{12}$$

Using (15) we get
$S_9 \approx \frac{1}{2}\left[\frac{1}{3.5} + \frac{1}{12.5}\right] = 0.1829$,
whereas direct summation yields $S_9 = 0.1801$. The error amounts to
roughly 1.5%. Observe that, as can be seen from (14) and (15), the
magnitude of an alternating sum is of the same order as the separate
summands. For this reason, adding another term makes an essential
change in the value. Taking the foregoing example, we find that $S_9$
is almost twice $S_8$.
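A minimal Python sketch (not from the original text; the name `estimate_odd` is an illustrative assumption) checks formula (15) on the nine-term sum and shows how strongly the extra term changes the value.

```python
def estimate_odd(f, a, b, h):
    """Formula (15); meant for sums with an odd number of terms."""
    return 0.5 * (f(a - h / 2) + f(b + h / 2))


def f(x):
    return 1.0 / x


# S_9 = 1/4 - 1/5 + ... - 1/11 + 1/12 (nine terms: a = 4, b = 12, h = 1)
direct_s9 = sum((-1) ** k * f(4 + k) for k in range(9))
approx_s9 = estimate_odd(f, a=4, b=12, h=1)

# S_8 is S_9 without its last term 1/12
direct_s8 = direct_s9 - f(12)
ratio = direct_s9 / direct_s8

print(f"S_9 direct = {direct_s9:.4f}, formula (15) = {approx_s9:.4f}")
print(f"S_9 / S_8 = {ratio:.2f}")
```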

Note the essential difference between an alternating sum and
a sum with terms of the same sign. We will increase the number of
terms in the sum so that the first and last terms remain unaltered
and the law of formation of the sum is the same. To do this, we reduce
the difference between adjacent terms of the sum, that is, we reduce
the size of $h$. Thus, from the sum $1 + \frac{1}{2} + \frac{1}{3}$ we
can obtain the sum
$1 + \frac{1}{1.1} + \frac{1}{1.2} + \frac{1}{1.3} + \dots + \frac{1}{3}$
or $1 + \frac{1}{1.01} + \frac{1}{1.02} + \dots + \frac{1}{3}$.

If all the terms of the sum have the same sign, then the sum is
roughly proportional to the number of terms. This is evident, say, in
the case of formula (10). Indeed, the integral in the right-hand
member of (10) is preceded by a factor $1/h$, where $h$ is inversely
proportional to the number of terms in the sum. Therefore, in the
process just described the magnitude of the sum increases indefinitely
with growth in the number of terms.

In the case of an alternating sum with an even number of
summands, the magnitude (with the number of terms increasing
as described) approaches a definite number that is independent
of the number of terms in the sum, namely,

$$S \approx \frac{f(a) - f(b)}{2} \tag{16}$$

This is readily discernible in formula (14), since for a large number
of terms $h$ will be close to zero and for this reason
$f\left(a - \frac{h}{2}\right) \approx f(a)$,
$f\left(b + \frac{h}{2}\right) \approx f(b)$. The case is similar for
an odd number of terms; from (15) we obtain in the limit a different
value, namely:

$$S \approx \frac{f(a) + f(b)}{2} \tag{17}$$


Observe that when the number of summands is small, that
is, when $h$ is great, the simplified formulas (16) and (17) are much
worse than (14) and (15). Let us consider an example. Suppose
$S = 1 - 2 + 3 - 4 = -2$. Using the simplified formula (16), we
get $S \approx \frac{1 - 4}{2} = -1.5$ (an error of 25%), whereas
formula (14) yields
$S \approx \frac{1}{2}\left[\left(1 - \frac{1}{2}\right) - \left(4 + \frac{1}{2}\right)\right] = -2$,
which is exact.

The expressions for the sums that we have just obtained are
approximate expressions, and their accuracy increases when adjacent
terms are closer together, that is, when $h$ decreases.
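The limiting behaviour can be observed numerically. The sketch below is an illustration only, with an assumed test function $f(x) = 1/x$ on the interval from $a = 1$ to $b = 2$ (this choice is not from the text). The number of terms is increased while the first and last terms stay fixed, and the direct sum is compared with (14) and with the limit (16).

```python
def f(x):
    return 1.0 / x


a, b = 1.0, 2.0
limit_16 = 0.5 * (f(a) - f(b))   # simplified formula (16)

for n in (3, 9, 99, 999):        # n intervals give n + 1 terms (an even number)
    h = (b - a) / n
    direct = sum((-1) ** k * f(a + k * h) for k in range(n + 1))
    est_14 = 0.5 * (f(a - h / 2) - f(b + h / 2))   # formula (14)
    print(f"terms = {n + 1:4d}  direct = {direct:.5f}  "
          f"(14) = {est_14:.5f}  (16) = {limit_16:.5f}")
```

As $h$ decreases, the direct sum settles near the value given by (16) instead of growing with the number of terms.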


Exercises 

1. Find the sum of $1 + \sqrt[3]{2} + \sqrt[3]{3} + \dots + \sqrt[3]{n}$
using formula (11). Compare with the results obtained by direct
summation for $n = 3, 4, 5$.

2. Find the sum of

$$1 + \frac{1}{2\sqrt{2}} + \frac{1}{3\sqrt{3}} + \dots + \frac{1}{20\sqrt{20}}$$

3.  Prove  that  formulas  (9)  and  (11)  are  absolutely  exact  if  f(x) 
is  a linear  function.  The  terms  of  the  sum  then  form  an  arithmetic 
progression. 

1.3  Numerical  solution  of  equations 

In  practical  computations  one  rather  often  has  to  deal  with 
the  numerical  solution  of  equations  like 

f(x)  = 0 (18) 

where $f$ is a given function. These equations may be algebraic or
transcendental (that is, nonalgebraic, for instance, trigonometric
and the like). They are sometimes grouped together and termed
finite, in contrast to, say, differential equations. Today there are a
large  number  of  methods  for  solving  the  various  classes  of  equations 
of  type  (18).  We  consider  here  only  three  of  the  more  universal  methods 
that  find  extensive  applications  in  other  divisions  of  mathematics 
as  well. 

One ordinarily begins with a rough and very approximate
solution, the so-called zero-order approximation. If a physics problem
is being tackled, this rough solution may be obtained from the
physical essence of the problem itself. One way is to sketch a graph
of the function $f(x)$ and obtain a rough solution by marking the
point of intersection of the graph and the $x$-axis.

Suppose we have such a rough solution $x_0$. Then denote the
exact solution, still unknown, by $\bar{x} = x_0 + h$. Using Taylor's
formula, we get

$$f(x_0 + h) = f(x_0) + h f'(x_0) + \dots \tag{19}$$

Fig. 3

But the left-hand member must be equal to zero. Dropping the
dots, that is to say, the higher-order terms, we get

$$h \approx -\frac{f(x_0)}{f'(x_0)}, \qquad \bar{x} \approx x_0 - \frac{f(x_0)}{f'(x_0)}$$

Denoting the right member by $x_1$,

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} \tag{20}$$

we get the “first approximation” (the subscript indicates the number
of the approximation). Doing the same thing again, we get a “second
approximation”:

$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}$$

and so on. Since rejecting higher terms in (19) is equivalent to
replacing the graph of the function $f(x)$ by a tangent to it at
$x = x_0$, the geometrical meaning of this method consists in a
sequential construction of tangents to the graph and finding the
points where they intersect the $x$-axis (Fig. 3). It is clear that
the successive approximations rapidly converge to the desired
solution, provided of course that the zero-order approximation was
not too far away.

This is called Newton's method (or the method of tangents, in accord
with its geometrical meaning). The convenience of this method is
that one only needs to compute the values of the function $f(x)$ and
its derivative and perform arithmetical operations, which is a simple
matter in the case of a function specified by a formula.

Let  us  take  an  example.  Suppose  we  have  to  solve  the  equation 


$$x^3 - 3x - 1 = 0 \tag{21}$$

Since the left-hand side is equal to $-3$ for $x = 1$ and to $1$ for
$x = 2$, there is a root lying between $x = 1$ and $x = 2$ and it is
naturally closer to 2 than to 1. So we take $x_0 = 2$. Then (20) yields

$$x_1 = 2 - \frac{2^3 - 3 \cdot 2 - 1}{3 \cdot 2^2 - 3} = 1.889$$

Similarly, we get (to three decimal places)

$$x_2 = 1.879, \quad x_3 = 1.879$$

And so, to the accuracy given, the solution is $\bar{x} = 1.879$.
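The computation above is easy to repeat by machine. The following Python sketch is an illustration, not the authors' program; the function name `newton` and its parameters are assumptions. It applies formula (20) three times to equation (21), starting from $x_0 = 2$.

```python
def newton(f, df, x0, steps):
    """Return [x0, x1, ..., x_steps], each computed from the last by (20)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs


def f(x):
    return x**3 - 3 * x - 1      # left-hand side of equation (21)


def df(x):
    return 3 * x**2 - 3          # its derivative


approximations = newton(f, df, x0=2.0, steps=3)
for n, x in enumerate(approximations):
    print(f"x_{n} = {x:.3f}")
```

The sketch does not guard against a vanishing derivative, so a starting value such as $x_0 = 1$ would raise a division error.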

It is interesting to note that in this example we could have
written the exact solution to the equation. Already in the 16th
century the Italian mathematician Cardano published a formula
for the solution of the cubic equation

$$x^3 + px + q = 0$$ *

which has the form

$$x = \sqrt[3]{-\frac{q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}} + \sqrt[3]{-\frac{q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}}$$

However, if we substitute into this formula the values of the
coefficients of equation (21), then under the cube root symbol we find
the imaginary numbers $0.5 \pm i\sqrt{0.75}$ and it is only the sum of
these roots that is real. This means we have to extract the roots
of imaginary numbers (cf. Sec. 5.4). Thus, even in this simple example
Newton's method proves to be much simpler than the “exact” formula,
so what is there left to say about quartic equations, where
the exact formula is so unwieldy that it is not even written out in
full in reference books! Or of equations higher than the fourth degree
and also of most of the transcendental equations where there is
no “exact” formula at all! For such equations the advantage of
numerical methods is particularly obvious.
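With complex arithmetic available, Cardano's formula can nevertheless be evaluated for equation (21). The sketch below is an illustration only (not from the text). It relies on Python's principal complex cube roots, which in this particular case form a conjugate pair whose product is $-p/3$, as the formula requires; for other coefficients a different branch may be needed.

```python
import cmath

p, q = -3.0, -1.0                        # coefficients of equation (21)
discriminant = q**2 / 4 + p**3 / 27      # equals -0.75, so its root is imaginary
sqrt_disc = cmath.sqrt(discriminant)

u = (-q / 2 + sqrt_disc) ** (1 / 3)      # principal cube root of 0.5 + i*sqrt(0.75)
v = (-q / 2 - sqrt_disc) ** (1 / 3)      # principal cube root of 0.5 - i*sqrt(0.75)
root = u + v                             # imaginary parts cancel

print(f"u = {u:.6f}, v = {v:.6f}")
print(f"x = {root.real:.3f} (imaginary part {root.imag:.1e})")
```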

Newton's method is one of a group of iteration methods (or
methods of successive approximation) in which a unified procedure
is repeated in succession (iterated) to yield ever more exact
approximate solutions. In general form, the iteration method as
applied to equation (18) looks like this: the equation is written in
the equivalent form

$$x = \varphi(x) \tag{22}$$


* The general equation of the third degree $ay^3 + by^2 + cy + d = 0$
is reduced to this form by the substitution $y = x - \frac{b}{3a}$.




Then a value $x = x_0$ is chosen for the zero-order approximation,
and the subsequent approximations are computed from the formulas
$x_1 = \varphi(x_0)$, $x_2 = \varphi(x_1)$, ..., and, generally,

$$x_{n+1} = \varphi(x_n) \tag{23}$$

Here two cases are possible:

(1) the process can converge, that is, the successive approximations
tend to a finite limit $\bar{x}$; then, passing to the limit in (23)
as $n \to \infty$, we see that $x = \bar{x}$ is a solution of
equation (22);

(2) the process can diverge, that is, there is no finite limit to
the constructed “approximations”; from this it does not follow that
there is no solution to (22); it might simply mean that the chosen
iteration process is not suitable.

We illustrate this in a simple equation that does not require
any “science” to solve:

$$x = \frac{x}{2} + 1 \tag{24}$$

with the obvious solution $\bar{x} = 2$. If we put $x_0 = 0$ and
compute to three decimal places, we get

$x_1 = 1$; $x_2 = 1.5$; $x_3 = 1.75$; $x_4 = 1.875$; $x_5 = 1.938$;

$x_6 = 1.969$; $x_7 = 1.984$; $x_8 = 1.992$; $x_9 = 1.996$;

$x_{10} = 1.998$; $x_{11} = 1.999$; $x_{12} = 2.000$; $x_{13} = 2.000$

which shows the process actually did converge. The limit can be
found more quickly if one notices that in the given case the
differences between successive approximations form a geometric
progression with first term $a = x_1 - x_0$ and ratio
$q = \frac{x_2 - x_1}{x_1 - x_0}$. Therefore the sum of the entire
progression, that is, $\bar{x} - x_0$, is

$$\frac{a}{1 - q} = \frac{x_1 - x_0}{1 - \frac{x_2 - x_1}{x_1 - x_0}} = \frac{(x_1 - x_0)^2}{2x_1 - x_0 - x_2}$$

whence

$$\bar{x} = x_0 + \frac{(x_1 - x_0)^2}{2x_1 - x_0 - x_2} = \frac{x_1^2 - x_0 x_2}{2x_1 - x_0 - x_2} \tag{25}$$


In more complicated examples the successive differences are only
reminiscent of a geometric progression; in such cases, formula (25)
does not yield an exact solution, but does enable one to skip over a
few approximations and obtain an approximation from which it is
again possible to begin iteration. (This is Aitken's method.)
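Formula (25) takes one line of code. In the Python sketch below (an illustration; the function name `aitken` is an assumption), the first call uses three iterates of equation (24) and returns the exact root. The second call uses the equation $x = \cos x$, a hypothetical nonlinear example that does not appear in the text, to show the "skipping" effect.

```python
import math


def aitken(x0, x1, x2):
    """Formula (25): limit estimate from three successive approximations."""
    return (x1**2 - x0 * x2) / (2 * x1 - x0 - x2)


def phi(x):
    return x / 2 + 1             # right member of equation (24)


# Equation (24): the differences form an exact geometric progression.
x0 = 0.0
x1 = phi(x0)
x2 = phi(x1)
exact_case = aitken(x0, x1, x2)

# Hypothetical example x = cos x: the progression is only approximate.
c0 = 1.0
c1 = math.cos(c0)
c2 = math.cos(c1)
accelerated = aitken(c0, c1, c2)

print(f"equation (24): {exact_case}")
print(f"x = cos x: plain iterate {c2:.4f}, formula (25) gives {accelerated:.4f}")
```

In the second case the accelerated value is not exact, but it is a better starting point for further iteration than the plain iterate.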

If (24) is solved for the $x$ in the right-hand member, the equation
can be rewritten in the equivalent form

$$x = 2x - 2 \tag{26}$$

If we begin with $x_0 = 0$, we successively get $x_1 = -2$,
$x_2 = -6$, $x_3 = -14$. Here the process will not converge. This
could have been foreseen by noting that (23) implies

$$x_{n+1} - x_n = \varphi(x_n) - \varphi(x_{n-1})$$

that is,

$$x_2 - x_1 = \varphi(x_1) - \varphi(x_0),$$

$$x_3 - x_2 = \varphi(x_2) - \varphi(x_1),$$

$$x_4 - x_3 = \varphi(x_3) - \varphi(x_2), \quad \dots$$


Thus, if the values of the function $\varphi(x)$ vary more slowly than
the values of its argument, the distances between successive
approximations will become smaller and smaller, and if the values of
the function $\varphi(x)$ vary faster than those of the argument, the
distances between successive approximations will progressively
increase and the process will diverge. Since the rate of change of
the values of $\varphi(x)$ relative to $x$ is equal to $\varphi'(x)$,
we find that if

$$|\varphi'(x)| \le k < 1 \tag{27}$$

then the process of iteration converges and it converges the faster,
the smaller $k$ is; but if

$$|\varphi'(x)| > 1$$

then the process diverges. These inequalities must hold for all $x$
or at least near the desired root $\bar{x}$ of (22).

We see that the equations (24) and (26) are completely equivalent
but they generate distinct iteration processes. (In order to
explain this difference, there is no need even to rewrite (24) as (26);
it suffices to notice that in order to obtain the next approximation,
one has to substitute the preceding approximation, via the first
method, into the right member of (24) and, via the second method,
into the left member of the same equation.) In other cases as well,
equation (18) can be rewritten in the form (22) in many ways, each
of which generates its own iteration method. Some of the methods
may prove to be rapidly convergent and therefore more convenient,
while others will turn out slowly convergent, and still others will
even be divergent.
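The two iteration processes can be run side by side. The following minimal Python sketch (the helper name `iterate` is an assumption made for illustration) applies rule (23) to the forms (24) and (26) with the same starting value $x_0 = 0$.

```python
def iterate(phi, x0, steps):
    """Return [x0, x1, ..., x_steps] with x_{n+1} = phi(x_n), as in (23)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(phi(xs[-1]))
    return xs


convergent = iterate(lambda x: x / 2 + 1, 0.0, 13)   # form (24): |phi'(x)| = 1/2
divergent = iterate(lambda x: 2 * x - 2, 0.0, 3)     # form (26): |phi'(x)| = 2

print([round(x, 3) for x in convergent])
print(divergent)
```

In the first list the distance to the root 2 halves at each step; in the second it doubles, in line with the derivative test.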

The above-mentioned solution of (21) can now be understood
as follows: the equation is rewritten in the equivalent form

$$x = x - \frac{x^3 - 3x - 1}{3x^2 - 3}$$

after which the iteration method is applied beginning with $x_0 = 2$.

In general form, Newton's method for equation (18) reduces
(see (20) and the subsequent equations) to this equation being
rewritten in the equivalent form

$$x = x - \frac{f(x)}{f'(x)} \tag{28}$$


with the iteration method applied afterwards. This form might
appear to be somewhat artificial, although it is easy to demonstrate
the equivalence of the equations (18) and (28): from (18) follows
(28) and conversely, but it is not immediately apparent how the
denominator $f'(x)$ helps. However it is easy to verify that the
derivative of the right member, that is,

$$1 - \frac{f'f' - ff''}{f'^2} = \frac{ff''}{f'^2}$$

vanishes when $x = \bar{x}$, where $\bar{x}$ is the solution of (18).
Hence (see the reasoning involving the estimate (27)), the closer the
successive approximations approach $\bar{x}$, the faster the process
converges.
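The vanishing of this derivative at the root can be seen numerically for equation (21). The sketch below is an illustration only; the name `newton_map_derivative` is an assumption, and the closed form $2\cos(\pi/9)$ for the root near 1.879 is a standard trigonometric fact that is not taken from the text.

```python
import math


def f(x):
    return x**3 - 3 * x - 1      # left-hand side of equation (21)


def df(x):
    return 3 * x**2 - 3


def d2f(x):
    return 6 * x


def newton_map_derivative(x):
    """Derivative of the right member of (28): f f'' / f'^2."""
    return f(x) * d2f(x) / df(x) ** 2


root = 2 * math.cos(math.pi / 9)   # root of (21) near 1.879
for x in (2.0, 1.889, 1.879, root):
    print(f"x = {x:.6f}   derivative = {newton_map_derivative(x):.2e}")
```

The printed values shrink toward zero as $x$ approaches the root, so the effective ratio $k$ in (27) keeps decreasing along the iterations.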

What is more, since, when (27) holds, the iterations do not
converge more slowly than a progression with ratio $k$, it follows
that Newton's method converges faster than a geometric progression
with any ratio!*

* A simple typical example will make it easy to establish the rate of
this convergence. Suppose we are considering the approximations by
Newton's method to the zero root of the equation $x + x^2 = 0$. These
approximations are connected by the relation

$$x_{n+1} = \frac{x_n^2}{1 + 2x_n} < x_n^2$$

Suppose $0 < x_0 < 1$. Then we successively get

$$x_1 < x_0^2, \quad x_2 < x_1^2 < x_0^4, \quad x_3 < x_2^2 < x_0^8, \quad \dots, \quad x_n < x_0^{2^n}, \quad \dots$$

It can be demonstrated that in the general case as well the rate of
convergence of Newton's method is of the order of $a^{2^n}$
($0 < a < 1$).

Let us now take up the so-called perturbation method, which,
like the method of iterations, is one of the most universal methods
in applied mathematics. A simple example will serve to illustrate
this method.

Suppose it is required to find the solution of the transcendental
equation

$$e^{x-1} = 2 - x + \alpha \tag{29}$$

for small $|\alpha|$. Notice that for $\alpha = 0$ a solution can be
found by inspection: $x = 1$. Therefore if a solution of (29)
dependent on $\alpha$ is sought by means of a series expansion in
powers of $\alpha$,

$$x = x_0 + a\alpha + b\alpha^2 + \dots$$

then by substituting $\alpha = 0$ we find that $x_0$ must be equal to
1. Now substitute the expansion

$$x = 1 + a\alpha + b\alpha^2 + c\alpha^3 + \dots \tag{30}$$

into both members of (29) and take advantage of the familiar
Taylor expansion of an exponential function to get

$$1 + \frac{a\alpha + b\alpha^2 + c\alpha^3 + \dots}{1!} + \frac{(a\alpha + b\alpha^2 + c\alpha^3 + \dots)^2}{2!} + \frac{(a\alpha + b\alpha^2 + c\alpha^3 + \dots)^3}{3!} + \dots = 2 - (1 + a\alpha + b\alpha^2 + c\alpha^3 + \dots) + \alpha$$

Removing brackets and retaining terms up to $\alpha^3$, we get

$$1 + a\alpha + b\alpha^2 + c\alpha^3 + \frac{a^2}{2}\alpha^2 + ab\alpha^3 + \frac{a^3}{6}\alpha^3 + \dots = 1 - a\alpha - b\alpha^2 - c\alpha^3 - \dots + \alpha$$


Equating the coefficients of equal powers of $\alpha$ in both members,
we get the relations

$$a = -a + 1,$$

$$b + \frac{a^2}{2} = -b,$$

$$c + ab + \frac{a^3}{6} = -c$$

whence we successively find

$$a = \frac{1}{2}, \quad b = -\frac{1}{16}, \quad c = \frac{1}{192}, \quad \dots$$

Substituting these values into (30), we obtain the desired solution
to (29) in the form of the series

$$x = 1 + \frac{\alpha}{2} - \frac{\alpha^2}{16} + \frac{\alpha^3}{192} + \dots \tag{31}$$

that converges very well for small $|\alpha|$.
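How well the series works can be checked against a direct numerical solution of (29). The Python sketch below is an illustration under stated assumptions: the function names are invented here, and bisection on the interval from 0 to 2 is simply one convenient root finder (the function $e^{x-1} + x - 2 - \alpha$ increases with $x$ and changes sign there for the values of $\alpha$ used).

```python
import math


def series_solution(alpha):
    """Series (31) truncated after the alpha**3 term."""
    return 1 + alpha / 2 - alpha**2 / 16 + alpha**3 / 192


def numerical_solution(alpha, lo=0.0, hi=2.0, iterations=60):
    """Bisection for g(x) = e^(x-1) + x - 2 - alpha = 0 on [lo, hi]."""
    def g(x):
        return math.exp(x - 1) + x - 2 - alpha

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


for alpha in (0.01, 0.1, 0.5):
    s = series_solution(alpha)
    x = numerical_solution(alpha)
    print(f"alpha = {alpha}: series = {s:.6f}, "
          f"numerical = {x:.6f}, difference = {abs(s - x):.1e}")
```

The difference grows with $|\alpha|$, which is the expected behaviour of a truncated expansion.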

In the general case, the perturbation method is applied in the
following manner. In the statement of a certain problem, let there
be a certain parameter $\alpha$ in addition to the basic unknown
quantities, and for $\alpha = 0$, let the problem be more or less easy
to solve (the unperturbed solution). Then the solution of the problem
for $\alpha$ close to 0 can in many cases be obtained by an expansion
in powers of $\alpha$ to a certain degree of accuracy. Here, the first
term of the expansion, not containing $\alpha$, is obtained when
$\alpha = 0$, which means it yields an unperturbed solution.
Subsequent terms yield corrections for “perturbation” of the solution.
These corrections are of first, second, etc. orders with respect to
$\alpha$. The terms themselves are often found by the method of
undetermined coefficients, that is, the coefficients of $\alpha$,
$\alpha^2$ and so on are designated by certain letters which are then
found from the conditions given in the statement of the problem.

The perturbation method yields good results only for small
$|\alpha|$. For large $|\alpha|$ the method can lead to fundamental
errors because it may turn out that rejected terms (those not written
down) are more essential than the ones that are left.

Thus, the perturbation method makes it possible, by proceeding
from the solution of certain “central” problems, to obtain a solution
of problems whose statement is close to the “central” ones, if of
course the formulation does not involve any fundamental, qualitative,
change in the solution.

In many problems, the very form of the first-order term enables
one to draw some useful conclusions concerning the dependence
of the solution upon the parameter in the case of a small variation
of the parameter.

The perturbation method is directly connected with the iteration
method; this we demonstrate in the case of equation (29).
First of all, it is convenient that the unperturbed solution be the
zero solution. This is attained by means of the substitution
$x = 1 + y$, whence

$$e^y = 1 - y + \alpha \tag{32}$$

To carry out iterations, observe that if the desired root $\bar{y}$ of
the equation is close to 0, the equation can be conveniently rewritten
in a form $y = \varphi(y)$ such that the expansion of $\varphi(y)$ in
powers of $y$ does not contain the first power but only the constant
term, and terms containing $y^2$, $y^3$ and so on. Then the expansion
of $\varphi'(y)$ will not contain the constant term and so for $y$
close to $\bar{y}$ and, hence, to 0, $|\varphi'(y)|$ will assume small
values, and this yields the convergence of the iteration method.
Therefore, in the original equation, transpose all terms involving
the first power of $y$ to the left-hand member, and all other terms
to the right-hand member, and then perform the iterations.

In our example, the left member of equation (32), $e^y$, by virtue
of Taylor's formula,

$$e^y = 1 + \frac{y}{1!} + \frac{y^2}{2!} + \frac{y^3}{3!} + \dots$$

contains the term $y$, and so it must be written as

$$e^y = y + (e^y - y)$$

the expansion of the parenthesis no longer containing $y$ in the first
power. And so in place of (32) we write

$$y + (e^y - y) = 1 - y + \alpha$$

whence, grouping terms with $y$ (but not removing the brackets!),
we arrive at the equation

$$y = \frac{1}{2} + \frac{\alpha}{2} - \frac{1}{2}(e^y - y)$$

This equation, which is equivalent to (32), is already prepared for
iteration. Expanding the right-hand side in powers of $y$, we get

$$y = \frac{\alpha}{2} - \frac{y^2}{4} - \frac{y^3}{12} - \dots \tag{33}$$

Now we can carry out the iterations beginning with $y_0 = 0$.
Substituting this value into the right member of (33), we get

$$y_1 = \frac{\alpha}{2} \tag{34}$$

Substituting $y = y_1$ into the right member of (33), we obtain

$$y_2 = \frac{\alpha}{2} - \frac{\alpha^2}{16} - \frac{\alpha^3}{96} - \dots \tag{35}$$


We see that the first approximation (34) coincides with the exact
solution (31) up to terms of the first order, the second approximation
(35), to within second-order terms. To construct the third
approximation we can simply substitute into the right member of (33)

$$y = \frac{\alpha}{2} - \frac{\alpha^2}{16}$$

We then have an expansion that coincides with the exact one up
to third-order terms inclusive, so that only these terms need be
taken into account in computations, and so on. This enables one,
when computing each subsequent iteration, to deal only with finite
sums.
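These iterations can be carried out mechanically with truncated power series. The Python sketch below is an illustration, not the authors' procedure in code: a series in $\alpha$ is stored as a list of exact `Fraction` coefficients up to $\alpha^3$, and the helper names `multiply` and `iterate_33` are assumptions. Only the terms of (33) written out above are used, which is enough to third order.

```python
from fractions import Fraction

ORDER = 3  # keep powers of alpha up to alpha**3


def multiply(p, q):
    """Product of two truncated series given as coefficient lists."""
    result = [Fraction(0)] * (ORDER + 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            if i + j <= ORDER:
                result[i + j] += p_i * q_j
    return result


def iterate_33(y):
    """One step y -> alpha/2 - y^2/4 - y^3/12, the right member of (33)."""
    y2 = multiply(y, y)
    y3 = multiply(y2, y)
    new = [-y2[k] / 4 - y3[k] / 12 for k in range(ORDER + 1)]
    new[1] += Fraction(1, 2)      # the term alpha/2
    return new


iterates = [[Fraction(0)] * (ORDER + 1)]   # y_0 = 0
for _ in range(3):
    iterates.append(iterate_33(iterates[-1]))

for n, y in enumerate(iterates):
    print(f"y_{n}: coefficients of alpha, alpha^2, alpha^3 =",
          [str(c) for c in y[1:]])
```

Each step fixes one more coefficient: the third iterate reproduces the coefficients 1/2, -1/16 and 1/192 of the series (31).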


Exercises 

1. Apply Newton's method to equation (21) beginning with the
value $x_0 = 0$, $x_0 = 1$.

2. Rewriting (21) as $x = \frac{x^3 - 1}{3}$, apply the iteration
method to it beginning with the value $x_0 = 0, 1, 2$.

3. Solve the equation $x^3 + \alpha x - 1 = 0$ for small $|\alpha|$ by
the perturbation method.


ANSWERS  AND  SOLUTIONS 


Sec.  1.1 

1.  By  the  trapezoidal  rule:  (a)  1.12,  (b)  0.88,  (c)  1.84. 
By  Simpson's  rule:  (a)  1.10,  (b)  0.88,  (c)  1.85. 


2. Partitioning the interval of integration into four subintervals,
we find $\int_0^3 \frac{x\,dx}{e^x + 1} \approx 0.6275$. Partitioning
the interval of integration into 6 intervals, we also obtain
$\int_0^3 \frac{x\,dx}{e^x + 1} \approx 0.6275$.
o 

3. Consider the formula (8) for the function $y = x$ over the interval
from $x_0$ to $x_0 + 2h$. Here

$$\bar{y} = \frac{1}{2h}\int_{x_0}^{x_0 + 2h} x\,dx = \frac{1}{2h}\,\frac{x^2}{2}\Big|_{x_0}^{x_0 + 2h} = \frac{(x_0 + 2h)^2 - x_0^2}{4h} = x_0 + h$$

whereas

$$\frac{2}{3}y_1 + \frac{1}{6}y_0 + \frac{1}{6}y_2 = \frac{2}{3}(x_0 + h) + \frac{1}{6}x_0 + \frac{1}{6}(x_0 + 2h) = x_0 + h$$

The results coincide, that is, the formula in this case is exact.
From this follows the exactness of (8) for the function $y = ax$
($a$ = constant) as well, since $a$ serves as a common factor in
all terms. Writing down the exact equation (8) for $y = ax$ and
$y = k$ (which has already been verified) and adding the left
members and the right members, we find that (8) holds true for
the function $y = ax + k$ as well. Equation (7) is considered
similarly.


Sec. 1.2

1. In our case $f(x) = \sqrt[3]{x}$, $h = 1$, $a = 1$, $b = n$. By
(11) we get

$$S_n \approx \int_{1/2}^{n + 1/2} \sqrt[3]{x}\,dx = \frac{3}{4}x^{4/3}\Big|_{1/2}^{n + 1/2}$$

For $n = 3$ the formula yields $S_3 = 3.69$; direct computation
yields $S_3 = 3.70$. The error is 0.4%.

For $n = 4$ we obtain $S_4 = 5.27$ by the formula and $S_4 = 5.29$
by direct calculation. The error is 0.3%. For $n = 5$ we have by
the formula $S_5 = 6.98$, by direct calculation $S_5 = 7.00$. The
error is 0.2%.


2. By formula (11) we get $S = 2.39$.

3. Let $f(x) = px + k$, that is, the sum is of the form

$$S_{n+1} = [pa + k] + [p(a + h) + k] + [p(a + 2h) + k] + \dots + [pb + k]$$

where $b = a + nh$. We thus have an arithmetic progression with
first term $a_1 = pa + k$ and difference $d = ph$. The sum of
such a progression is, as we know, equal to

$$S_{n+1} = \frac{a_1 + a_{n+1}}{2}(n + 1) = \frac{(pa + k) + (pb + k)}{2}(n + 1)$$

Apply, say, (9) to get the value

$$S_{n+1} = \frac{1}{h}\int_a^b (px + k)\,dx + \frac{1}{2}(pa + k) + \frac{1}{2}(pb + k)$$

$$= \frac{b - a}{h}\left(p\,\frac{a + b}{2} + k\right) + \left(p\,\frac{a + b}{2} + k\right)$$

$$= \left(p\,\frac{a + b}{2} + k\right)(n + 1)$$

which coincides with the exact value for the sum.

Sec.  1.3 

1. For $x_0 = 0$ we obtain another root $\bar{x} = -0.347$ of
equation (21). For $x_0 = 1$ the method cannot be applied (the
denominator vanishes).

2. For $x_0 = 0$ and $x_0 = 1$ we get convergence to the root
$\bar{x}$ from Exercise 1. For $x_0 = 2$ the iteration process
diverges.

3. The expansion of the solution is of the form

$$x = 1 - \frac{1}{3}\alpha + \frac{1}{81}\alpha^3 + \dots$$


Chapter  2 

MATHEMATICAL  TREATMENT  OF 
EXPERIMENTAL  RESULTS 


In practical work, it often happens that a relationship is obtained
between variable quantities, say $x$ and $y$, in the course of
experimentation and measurements. Ordinarily, such a relationship is
given in the form of a table with each value of $x$ at which the
measurement was taken indicated and paired with the appropriate
measured value of $y$. In this chapter we first examine the general
rules for operating with tables and then certain new aspects that
arise when processing the results of experiments.

2.1  Tables  and  differences 

Very often one has to consider functions given in tabular
form, such as mathematical tables, say, tables of logarithms,
sines, squares, etc. There are also tables of physical quantities
taken from handbooks, such as tables of the boiling point of
a liquid versus pressure, and the like. Finally, a relationship
between variable quantities can be obtained in the form of the
unprocessed results of experiments or measurements. In all
such cases, the numerical values of the dependent variable are given,
in a table, for definite numerical values of the independent variable.
The functions thus specified can be used in further operations; for
instance, it may be required to differentiate or integrate such
functions. There may be a need of function values for values of the
independent variable not given in the table (interpolation problem)
or for values of the independent variable that go beyond the limits
of the table (extrapolation problem).

For the sake of simplicity, we assume that the independent
variable $x$ takes on values that form an arithmetic progression, that
is, $x = x_0$, $x = x_1 = x_0 + h$, $x = x_2 = x_0 + 2h$, ...,
$x = x_n = x_0 + nh$. These values of the argument will be called
integral values and $h$ will be the tabulation interval. The
corresponding values of the function entered in the table will be
denoted by $y_0 = y(x_0)$, $y_1 = y(x_1)$, ..., $y_n = y(x_n)$.

The increments in the variable $x$ are all the same and are equal
to $h$. The increments in the variable $y$ are, generally speaking,
different. They are called first-order differences (or differences of
the first order) and are denoted by $\delta y_{1/2} = y_1 - y_0$,
$\delta y_{1+1/2} = y_2 - y_1$, $\delta y_{2+1/2} = y_3 - y_2$, ...,
$\delta y_{n-1/2} = y_n - y_{n-1}$, since they are naturally
associated with half-integral values of $x$, that is, the midpoints
between adjacent integral values of the argument:
$x_{1/2} = x_0 + h/2$, $x_{1+1/2} = x_0 + 3h/2$, ...,
$x_{n-1/2} = x_0 + (n - 1/2)h$.*

From these differences we can again take differences to obtain
so-called second-order differences, again defined for integral values
of $x$:

$$\delta^2 y_1 = \delta y_{1+1/2} - \delta y_{1/2}, \quad \delta^2 y_2 = \delta y_{2+1/2} - \delta y_{1+1/2}, \quad \dots, \quad \delta^2 y_{n-1} = \delta y_{n-1/2} - \delta y_{n-3/2}$$

(the superscript 2 indicates the order of the difference and is not
an exponent), and so on.

To illustrate, we give a portion of a table of common logarithms
with computed differences multiplied by $10^5$.

Table 1

  k      x_k      y_k       10^5 δy_k   10^5 δ²y_k   10^5 δ³y_k
  0      10.0     1.00000
  1/2    10.05                 432
  1      10.1     1.00432                   -4
  3/2    10.15                 428                        0
  2      10.2     1.00860                   -4
  5/2    10.25                 424                       -1
  3      10.3     1.01284                   -5
  7/2    10.35                 419                        2
  4      10.4     1.01703                   -3
  9/2    10.45                 416                       -1
  5      10.5     1.02119                   -4
  11/2   10.55                 412                       -1
  6      10.6     1.02531                   -5
  13/2   10.65                 407
  7      10.7     1.02938
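The entries of Table 1 can be regenerated with a few lines of Python. This is a sketch for illustration (the helper name `differences` is an assumption); it rounds the logarithms to five decimals, works in units of $10^{-5}$ so that all differences are integers, and takes differences three times.

```python
import math


def differences(values):
    """Differences of adjacent entries: [v1 - v0, v2 - v1, ...]."""
    return [b - a for a, b in zip(values, values[1:])]


x_values = [10.0 + 0.1 * k for k in range(8)]               # k = 0, 1, ..., 7
y_scaled = [round(1e5 * math.log10(x)) for x in x_values]   # 10^5 * y_k

first = differences(y_scaled)    # belong to k = 1/2, 3/2, ..., 13/2
second = differences(first)      # belong to k = 1, 2, ..., 6
third = differences(second)      # belong to k = 3/2, 5/2, ..., 11/2

print("first :", first)
print("second:", second)
print("third :", third)
```

Each pass shortens the list by one entry, which matches the staggered layout of the table.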

The smallness of the second-order differences compared with
the first-order differences and their practical constancy (the
third-order differences are of the order of rounding errors) in this
example points up the smoothness of the function and the absence of
any random irregularities. Such regularity may appear in higher-order
differences and always indicates a regularity in the variation of the
function (cf. Exercise 2). Of course, if the tabulation interval is
not small, and also close to discontinuities, and the like, the
difference may not be small, but some sort of regularity is
ordinarily apparent.

* Sometimes the difference $y_{k+1} - y_k$ is associated not with the
value $x = x_{k+1/2}$ but with $x = x_k$. Then it is ordinarily
denoted as $\Delta y_k = \Delta y|_{x = x_k} = y_{k+1} - y_k$.
With our notation, that is,
$y_{k+1} - y_k = \delta y_{k+1/2} = \delta y|_{x = x_{k+1/2}}$, the
differences are called central differences.

Differences  are  widely  used  in  interpolation.  Let  it  be  required 
to  find  the  value  of  y for  a certain  value  of  x between  the  tabulated 
values  of  xk  and  #A+i.The  simplest  approach  is  that  of  a linear  inter- 
polation, which  consists  in  an  approximate  replacement  of  the  function 
under  study  by  a linear  function  and  in  a manner  such  that  both 
functions  are  coincident  for  x = xk  and  x=xk+i  (Fig.  4).  Geometrically, 
this  amounts  to  replacing  the  arc  AB  of  the  unknown  graph  shown 
dashed  in  Fig.  4 by  the  chord  AB  joining  two  known  points  A and  B . 
Set  x — xk  — s.  Since  a linear  function  is  expressed  by  a first-degree 
equation,  the  desired  value  of  y depends  on  s by  the  equation 

y = a + bs  (1) 

where $a$ and $b$ are certain coefficients. From the conditions at $x_k$ and $x_{k+1}$ we find that $y_k = a$, $y_{k+1} = a + bh$, whence

$$\delta y_{k+1/2} = y_{k+1} - y_k = bh$$

Expressing $a$ and $b$ from these equations and substituting into (1), we get the final formula for linear interpolation:

$$y = y_k + \delta y_{k+1/2}\,\frac{s}{h} \tag{2}$$

(Derive this formula from the similarity of triangles in Fig. 4.) Formula (2) can be used if the function under study on the interval from $x_k$ to $x_{k+1}$ is only slightly different from a linear function, that is to say, if $h$ is sufficiently small.* For $k = 0$ and $s < 0$, (2) effects a linear extrapolation of the function towards $x < x_0$, and for $k = n - 1$ and $s > h$, towards $x > x_n$. Of course, in the case of extrapolation, one should not go far beyond the tabulated values of $x$, since our linear law of variation of the function is justified only over a small interval in the variation of $x$.

The formula of linear interpolation (2), like subsequent formulas as well, can be written in a form that does not contain differences. Substituting the expression $\delta y_{k+1/2} = y_{k+1} - y_k$ into (2), we get an equivalent formula:

$$y = \left(1 - \frac{s}{h}\right) y_k + \frac{s}{h}\, y_{k+1}$$

It is quite evident that when $s$ varies from 0 to $h$, the coefficient of $y_k$ varies from 1 to 0, and the coefficient of $y_{k+1}$ from 0 to 1. And so for $s = 0$ we get $y = y_k$ and for $s = h$ we have $y = y_{k+1}$.
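The two equivalent forms of linear interpolation can be sketched in a few lines of Python. The function names and the sample numbers are assumptions made for illustration only.

```python
def linear_interpolate(x_k, h, y_k, y_k1, x):
    """Formula (2): y = y_k + dy * s / h with s = x - x_k."""
    s = x - x_k
    dy = y_k1 - y_k            # first-order difference, referred to x_k + h/2
    return y_k + dy * s / h

def linear_interpolate_weights(x_k, h, y_k, y_k1, x):
    """Equivalent form without differences: weights (1 - s/h) and s/h."""
    s = x - x_k
    return (1 - s / h) * y_k + (s / h) * y_k1

# Assumed sample values: two neighbouring table entries
x_k, h, y_k, y_k1 = 1.0, 0.5, 2.0, 3.0
mid = linear_interpolate(x_k, h, y_k, y_k1, 1.25)      # inside the interval
extrap = linear_interpolate(x_k, h, y_k, y_k1, 0.75)   # s < 0: extrapolation
print(mid, extrap)
```

With $s < 0$ or $s > h$ the same function extrapolates, which, as noted above, is trustworthy only close to the tabulated interval.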

More accuracy is obtained by quadratic interpolation, in which the function at hand is approximately replaced by a quadratic function in such a manner that both functions coincide for $x = x_k$, $x_{k+1}$ and $x_{k+2}$ (this was done, in a different notation, in Sec. 1.1 in the derivation of Simpson's rule). The indicated quadratic function is conveniently sought in the form

$$y = a + bs + cs(s - h) \tag{3}$$

By the hypothesis,

$$y_k = a, \quad y_{k+1} = a + bh, \quad y_{k+2} = a + b \cdot 2h + c \cdot 2h^2$$

whence

$$\delta y_{k+1/2} = y_{k+1} - y_k = bh, \quad \delta y_{k+3/2} = y_{k+2} - y_{k+1} = 2ch^2 + bh,$$

$$\delta^2 y_{k+1} = \delta y_{k+3/2} - \delta y_{k+1/2} = 2ch^2$$

Now expressing $a$, $b$, $c$ in these terms and substituting into (3), we obtain Newton's formula for quadratic interpolation:

$$y = y_k + \delta y_{k+1/2}\,\frac{s}{h} + \frac{\delta^2 y_{k+1}}{2}\,\frac{s}{h}\left(\frac{s}{h} - 1\right) \tag{4}$$


As has already been mentioned, this formula may be used also for extrapolation. Equation (4) is not quite symmetric: we have used the values $y_k$, $y_{k+1}$ and $y_{k+2}$, whereas $x$ lies between $x_k$ and $x_{k+1}$. If we reverse the direction of the $x$-axis and utilize the values $y_{k+1}$, $y_k$ and $y_{k-1}$ in similar fashion, then in place of (4) we get

$$y = y_{k+1} + \left(-\delta y_{k+1/2}\right)\frac{h - s}{h} + \frac{\delta^2 y_k}{2}\,\frac{h - s}{h}\left(\frac{h - s}{h} - 1\right) \tag{5}$$


* From the more exact equation (4) it is readily seen that for small $h$ the error in a linear interpolation is of the order of $h^2$, since the second-order differences are of that order.

Fig. 5

Fig. 6


which is nonsymmetric in the other direction. Now taking the half-sum of the right members of (4) and (5), we obtain the symmetric Bessel formula:

$$y = \frac{y_k + y_{k+1}}{2} + \delta y_{k+1/2}\left(\frac{s}{h} - \frac{1}{2}\right) + \frac{\delta^2 y_k + \delta^2 y_{k+1}}{4}\,\frac{s}{h}\left(\frac{s}{h} - 1\right)$$

which possesses a high degree of accuracy. We leave it to the reader to transform the Newton and Bessel formulas so that $y$ is expressed directly in terms of the “mesh-point” values $y_k$ and not in terms of their differences.
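The gain in accuracy from (2) to (4) and then to the Bessel formula can be checked on a function whose values are known. The sketch below is only an illustration: the function names are hypothetical, and the test function $\sin x$ with step $h = 0.1$ is an assumption, not an example from the text. The argument `t` stands for $s/h$.

```python
import math

def newton_forward(y_k, y_k1, y_k2, t):
    """Formula (4); t = s/h, uses y_k, y_{k+1}, y_{k+2}."""
    d1 = y_k1 - y_k                   # delta y_{k+1/2}
    d2 = y_k2 - 2 * y_k1 + y_k        # delta^2 y_{k+1}
    return y_k + d1 * t + 0.5 * d2 * t * (t - 1)

def newton_backward(y_km1, y_k, y_k1, t):
    """Formula (5); uses y_{k+1}, y_k, y_{k-1} with the x-axis reversed."""
    d1 = y_k1 - y_k                   # delta y_{k+1/2}
    d2 = y_k1 - 2 * y_k + y_km1       # delta^2 y_k
    u = 1 - t                         # (h - s)/h
    return y_k1 - d1 * u + 0.5 * d2 * u * (u - 1)

def bessel(y_km1, y_k, y_k1, y_k2, t):
    """Half-sum of the right members of (4) and (5)."""
    return 0.5 * (newton_forward(y_k, y_k1, y_k2, t)
                  + newton_backward(y_km1, y_k, y_k1, t))

# Assumed test function: sin x tabulated with step h = 0.1 around x_k = 0.5
h, x_k, t = 0.1, 0.5, 0.5
ym1, y0, y1, y2 = (math.sin(x_k + j * h) for j in (-1, 0, 1, 2))
exact = math.sin(x_k + t * h)

err_linear = abs(y0 + (y1 - y0) * t - exact)
err_newton = abs(newton_forward(y0, y1, y2, t) - exact)
err_bessel = abs(bessel(ym1, y0, y1, y2, t) - exact)
print(err_linear, err_newton, err_bessel)
```

In this run the three errors decrease in the order linear, Newton, Bessel; the one-sided errors of (4) and (5) have opposite signs and largely cancel in the half-sum.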

In  that  manner,  it  is  possible  to  derive  interpolation  formulas 
of  still  higher  degree.  However,  since  the  accuracy  of  experimental 
findings  is  limited  — and  the  same  can  be  said  of  tabulated  values 
of  functions  — the  use  of  differences  of  too  high  an  order  is  ordi- 
narily not  justified.  In  most  cases  (with  the  exception  of  meas- 
urements and  calculations  carried  out  to  particularly  high  degrees 
of  accuracy),  second-order  and  even  first-order  differences  suffice. 

If the function at hand is discontinuous, the interpolation may be carried out only over intervals that do not contain points of discontinuity. If this fact is lost sight of, the interpolation may give a distorted picture of the true behaviour of the function. For instance, in Fig. 5 we have the internal energy of a unit mass of water at normal pressure shown as a function of the temperature.* This function has discontinuities at the temperatures of phase transitions, that is, at the freezing and boiling points of water. The dashed line in the figure shows the result of quadratic interpolation when the chosen mesh points correspond to distinct phases; it is quite clear that this interpolation distorts the actual picture. A similar distortion results in the case of a discontinuity of the derivative (a salient point on the curve) of the function under study. The dashed line in Fig. 6 shows the result of a linear interpolation in the case of a “cuspidal” maximum. The error near this maximum stands out clearly.

Exercises 

1. Let $y_0 = 1.00$, $y_1 = 1.25$, $y_2 = 1.65$, $y_3 = 2.34$. Find $y_{3/2}$ using formulas (2), (4), (5) and the Bessel formula.

2. Verify the fact that if a table has been compiled for an $n$th-degree polynomial, then the $n$th-order differences are constant and differences of order $n + 1$ are equal to zero.

2.2  Integration  and  differentiation  of  tabulated 
functions 

The  integration  of  functions  given  in  tabular  form  presents  no 
special  interest.  Earlier,  in  the  numerical  integration  of  functions  spe- 
cified by  formulas  we  first  set  up  a table  of  the  integrand  (see 
Sec.  1.1). 

When computing the derivative of a tabulated function, bear in mind that the best way to find the derivative $y'(x)$ from two values of the function is to take the values on the right and on the left at equal distances from that value of $x$ for which we want to determine the derivative:

$$y'(x) \approx \frac{y\left(x + \frac{h}{2}\right) - y\left(x - \frac{h}{2}\right)}{h}$$


Thus, if we have values of $y$ for values of $x$ at equally spaced intervals (that is, in arithmetic progression), then it is convenient to compute the derivatives at the midpoints of the intervals. In other words, if the values of $y$ were specified for “integral” values of $x$ (see Sec. 2.1), then the values of $y'$ will be computed for “half-integral” values of $x$. The same procedure is used to find the derivative of $y'$, i.e. $y''$, from the values of $y'$. Here, the values of $y''$ will again be found for integral values of $x$. The derivative $y''$ may be expressed directly in terms of $y$:

$$y''(x) \approx \frac{\dfrac{y(x + h) - y(x)}{h} - \dfrac{y(x) - y(x - h)}{h}}{h} = \frac{y(x + h) - 2y(x) + y(x - h)}{h^2}$$

* More frequently, use is made of the so-called enthalpy (heat content) $H = E + pV$, where $p$ is the pressure and $V$ is the volume. The heat capacity at constant pressure is precisely equal to the derivative of $H$ with respect to the temperature $\theta$.


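The formulas given here can be conveniently written down in the notation of Sec. 2.1:

$$y'_{k+1/2} = y'(x_{k+1/2}) = y'\left(x_k + \frac{h}{2}\right) \approx \frac{y(x_k + h) - y(x_k)}{h} = \frac{\delta y_{k+1/2}}{h};$$

this equation is justification of the fact that we referred the difference $y_{k+1} - y_k$ to the value $x = x_{k+1/2}$. Similarly,

$$y''_k = y''(x_k) \approx \frac{y_{k+1} - 2y_k + y_{k-1}}{h^2} = \frac{\delta^2 y_k}{h^2}$$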

Here is an example (Table 2, in which half-integral values of the argument are not tabulated).

Table 2

  x       y        y'       y''

 1.00   1.6487
                  0.846
 1.10   1.7333              0.42
                  0.888
 1.20   1.8221              0.46
                  0.934
 1.30   1.9155              0.49
                  0.983
 1.40   2.0138              0.49
                  1.032
 1.50   2.1170
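The $y'$ and $y''$ columns of Table 2 can be recomputed from its $x$ and $y$ columns. The Python sketch below does this; the variable names are illustrative assumptions, and the data are the tabulated values themselves.

```python
xs = [1.00, 1.10, 1.20, 1.30, 1.40, 1.50]
ys = [1.6487, 1.7333, 1.8221, 1.9155, 2.0138, 2.1170]   # y column of Table 2
h = 0.10

# y' at the half-integral points x_{k+1/2} = x_k + h/2
d1 = [(b - a) / h for a, b in zip(ys, ys[1:])]
# y'' back at the integral points x_1 ... x_4
d2 = [(b - a) / h for a, b in zip(d1, d1[1:])]

for k, v in enumerate(d1):
    print(f"y'({xs[k] + h / 2:.2f}) = {v:.3f}")
for k, v in enumerate(d2, start=1):
    print(f"y''({xs[k]:.2f}) = {v:.2f}")
```

Each differentiation shortens the list by one entry and shifts the points by half a step, which is why $y'$ falls between the rows of the table and $y''$ returns to them.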


Values of $y'$ for other values of $x$ may be obtained by interpolation. For instance, for integral $x$ we obtain, via interpolation,

$$y'_k \approx \frac{1}{2}\left(y'_{k-1/2} + y'_{k+1/2}\right) \approx \frac{1}{2}\left(\frac{y_k - y_{k-1}}{h} + \frac{y_{k+1} - y_k}{h}\right) = \frac{y_{k+1} - y_{k-1}}{2h}$$

Thus, we can determine at once the values of $y'$ for integral $x$ via the values of $y$ for integral $x$ on the right and on the left (see Table 3).

Table 3

  x       y        y'

 1.00   1.6487
 1.10   1.7333   0.867
 1.20   1.8221   0.911
 1.30   1.9155   0.960
 1.40   2.0138   1.002
 1.50   2.1170

However, using this method we obtain, firstly, one fewer value of the derivative than in the first method and, secondly, less information on the behaviour of the derivative at the endpoints. Say, in Table 2 we know the derivative for $x = 1.05$ (near the beginning of the interval, $x = 1$) and for $x = 1.45$ (near the end of the interval, $x = 1.5$), but in Table 3 only for $x = 1.1$ and for $x = 1.4$, which is to say, at values of the argument farther away from the endpoints of the interval. Finally, when computing values of $y'$ for nonintegral (say half-integral) values of $x$ via interpolation, the values obtained from Table 2 turn out to be more reliable than those obtained from Table 3, since the inclination of a curve is more accurately reproduced by small chords than by large chords. The first method is therefore preferable.

Very  important,  fundamental  questions  arise  in  connection  with 
the  restricted  accuracy  and  errors  inherent  in  every  measurement. 

In computing an integral, each separate measured value of $y$ is multiplied by the quantity $\Delta x$. Therefore as the number of separate measured values of the function $y$ is increased, the coefficient with which each separate value enters the expression of the integral diminishes in inverse proportion to the number of subintervals $\Delta x$.

Consequently, there is also a reduction in the error in the integral that is due to the errors in each separate measurement of the quantity $y$.

In computing a derivative, the difference between two values of $y$ is divided by $\Delta x$. The smaller the subinterval $\Delta x$, that is, the smaller the denominator, the larger the error contributed to the derivative by the given error in each measured value of $y$. For this reason, the derivative of a function specified by experimental values proves to be known with an accuracy less than that of the function itself.

Fig. 7

Fig. 8
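The opposite roles played by the step $\Delta x$ in computing an integral and a derivative from a table can be checked numerically. In the sketch below one tabulated value is spoiled by a fixed error `eps`, and the resulting changes in the integral and in the derivative are compared for finer and finer tables. The table of $y = x$ on $[0, 1]$, the size of `eps` and all function names are assumptions chosen for illustration.

```python
def trapezoid(ys, h):
    """Trapezoidal estimate of the integral over an equally spaced table."""
    return h * (0.5 * ys[0] + sum(ys[1:-1]) + 0.5 * ys[-1])

def midpoint_derivatives(ys, h):
    """First differences divided by h: y' at the midpoints of the intervals."""
    return [(b - a) / h for a, b in zip(ys, ys[1:])]

def effect_of_error(n, eps=0.01):
    """Spoil one interior value of the table y = x on [0, 1] by eps."""
    h = 1.0 / n
    clean = [k * h for k in range(n + 1)]
    noisy = list(clean)
    noisy[n // 2] += eps
    d_int = abs(trapezoid(noisy, h) - trapezoid(clean, h))
    d_der = max(abs(a - b) for a, b in
                zip(midpoint_derivatives(noisy, h),
                    midpoint_derivatives(clean, h)))
    return d_int, d_der

for n in (10, 100, 1000):
    d_int, d_der = effect_of_error(n)
    print(f"n = {n:5d}: integral error {d_int:.2e}, derivative error {d_der:.2e}")
```

The error in the integral is `eps` times the step and shrinks as the table is refined, while the error in the derivative is `eps` divided by the step and grows.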


We  illustrate  the  difference  between  differentiation  and  integra- 
tion with  an  example. 

Fig. 7 depicts two curves, one solid, the other dashed. The solid curve is the graph of the function $y = x - 0.1x^2$, the dashed curve is the graph of the function $y_1 = x - 0.1x^2 + 0.5e^{-8(x-3)^2}$. From the figure it is clear that one curve differs perceptibly from the other only over a small range of $x$.

Fig. 9

Fig. 10

Fig. 8 shows the graphs of $I(x) = \int_0^x y\,dx$ and $I_1(x) = \int_0^x y_1\,dx$ (dashed curve). We see that the difference between the curves $y(x)$ and $y_1(x)$ produces a slight addition to the integral $I_1(x)$, which on the graph is only apparent for $x > 2.8$. On the whole, there is only a slight difference between the curves $I(x)$ and $I_1(x)$.

Fig. 9 depicts the curves of the derivatives $y'(x)$ and $y_1'(x)$. We see that a slight change in the function over a small interval gave rise (over the same interval) to large changes in the derivative. The second derivatives exhibit a still greater difference. Their graphs are shown in Fig. 10, where the scale on the $y$-axis is half that of Figs. 7 to 9.

When a curve is obtained experimentally, a slight variation in the curve over some interval may be the result of an error in the experiment. It is clear from the preceding example that such isolated errors do not substantially affect the magnitude of the integral, but they strongly affect the value of the derivative (particularly higher derivatives).
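The comparison shown in Figs. 7 to 10 can be reproduced numerically. The sketch below uses the two functions of the example; the interval $[0, 4]$, the grid of 4000 steps and the helper names are assumptions made for illustration.

```python
import math

def y(x):
    return x - 0.1 * x**2

def y1(x):
    return y(x) + 0.5 * math.exp(-8 * (x - 3) ** 2)

n = 4000                       # assumed fine grid on [0, 4]
h = 4.0 / n
xs = [k * h for k in range(n + 1)]

def integral(f):
    """Trapezoidal rule over the whole grid."""
    vals = [f(x) for x in xs]
    return h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])

def max_gap(g0, g1):
    """Largest difference between two functions at the interior grid points."""
    return max(abs(g1(x) - g0(x)) for x in xs[1:-1])

def d1(f):
    """Central-difference first derivative."""
    return lambda x: (f(x + h / 2) - f(x - h / 2)) / h

def d2(f):
    """Central-difference second derivative."""
    return lambda x: (f(x + h) - 2 * f(x) + f(x - h)) / h**2

gap_y = max_gap(y, y1)
gap_int = integral(y1) - integral(y)
gap_d1 = max_gap(d1(y), d1(y1))
gap_d2 = max_gap(d2(y), d2(y1))
print(f"function: {gap_y:.3f}, integral: {gap_int:.3f}, "
      f"y': {gap_d1:.3f}, y'': {gap_d2:.3f}")
```

The bump changes the function by at most 0.5 and the integral over the whole interval by about 0.3, whereas the first derivative changes by more than 1 and the second by about 8, although $y'$ and $y''$ of the smooth curve are themselves no larger than 1 in absolute value.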

In  order  to  obtain  reliable  values  of  the  derivative,  it  is  first 
necessary  to  find  a formula  that  gives  a good  description  of  the 
experimental  data  and  then  find  the  derivative  using  this  formula. 

Since the formula is constructed on the basis of all the experimental data, the value of the derivative that it yields for each value of $x$ takes all the data into account, and not only two or three close-lying values. It is therefore natural to expect that random errors in certain measurements will have a smaller effect on the value of the derivative.

Choosing  a formula  to  describe  the  results  of  an  experiment  is, 
generally,  an  essential  part  of  the  treatment  of  experimental  findings. 
This  problem  of  curve  fitting  on  the  basis  of  experimental  data  is 
discussed  in  the  next  two  sections. 

Exercises 

Use the conditions of Exercise 2, Sec. 1, to compute the values of $y'$ for half-integral values of $x$, assuming $\Delta x = h = 0.5$. Interpolate the result linearly and also use formulas (4) and (5) for integral values of $x$.

2.3  Fitting  of  experimental  data  by  the  least-squares 
method 

The  fitting  of  formulas  to  experimental  findings  is  called  curve 
fitting.  Actually,  of  course,  a formula  is  the  better,  the  more  theo- 
retical material  has  been  put  into  it  and  the  less  empirical  it  is.  In 
reality,  one  first  has  to  specify  the  type  of  formula,  and  then,  using 
the  results  of  experiments,  to  determine  the  values  of  the  various 
constants  in  the  formula. 

Before  attempting  to  fit  a formula,  it  is  useful  to  plot  the  ex- 
perimental findings  on  a graph  and  then  draw  freehand  the  most 
likely  curve  through  the  points  obtained.  In  doing  so,  we  will  imme- 
diately perceive  those  findings  that  are  particularly  suspect  as  being 
erroneous.  In  drawing  the  curve,  it  is  very  important,  besides  using 
the  experimentally  found  points,  to  reason  generally  about  how 
the  curve  should  behave  at  values  of  the  argument  very  close 
to  zero,  at  large  values  of  the  argument,  whether  the  curve  passes 
through  the  origin  of  coordinates,  or  intersects  the  coordinate  axes, 
or is tangent to them, etc.

Now suppose all this preliminary work has been carried out and a type of equation has been chosen; we need to determine the values of the constants that enter into this equation.

How is this done?

Let us consider the simplest case.

Suppose that $y$ is proportional to $x$, that is, we seek a formula of the form $y = kx$. The problem reduces to determining the coefficient $k$. Each experiment yields a specific value for $k$, namely,

$$k_n = \frac{y_n}{x_n}$$

where $x_n$, $y_n$ are the values of the quantities $x$ and $y$ obtained in the $n$th experiment. The index $n$ on $k$ indicates the value corresponding to the $n$th experiment. We can take the mean of the values $k_n$, putting

$$\bar{k} = \frac{\sum_{n=1}^{p} k_n}{p}$$

where $p$ is the total number of experiments. We obtain the formula $y = \bar{k}x$.

Observe that this is the simplest but not the best procedure for choosing $k$. Indeed, let $x$ be a quantity that we specify exactly and that describes the conditions of the experiment; let $y$ be the result of an experiment that contains a certain error of measurement. We will assume that the measurement error $\Delta y$ is roughly the same for both small and large values of $y$. Then the error in the quantity $k_n$, equal to $\Delta y / x_n$, is the larger, the smaller $x_n$ is. Hence, in determining the quantity $k$, it is best to rely on experiments with large $x_n$.

We now pose the problem of finding the value of $k$ for which the function $y = kx$ fits the experimental data in the best way. (The meaning of this rather vague word “best” will become clear from what follows.) As a measure of the deviation of the function from the experimental data in the $n$th experiment we choose the quantity $(y_n - kx_n)^2$. Why do we choose $(y_n - kx_n)^2$ and not $y_n - kx_n$? It is clear that both signs of the deviation of $kx_n$ from $y_n$ are bad: it is bad if $k$ is such that $y_n < kx_n$ and it is also bad if $k$ is such that $y_n > kx_n$. If for the measure of the deviation we took the quantity $y_n - kx_n$ and then sought the sum of the deviations in several experiments, we might obtain a very small value due to mutual cancelling of individual large terms of opposite sign. But this would in no way indicate that the function $y = kx$ is good. Now if we take $(y_n - kx_n)^2$ for the deviation, no such mutual cancelling will occur, since all the quantities $(y_n - kx_n)^2$ are nonnegative. Note that we could, in principle, take $|y_n - kx_n|$, $(y_n - kx_n)^4$, etc. instead of $(y_n - kx_n)^2$. But subsequent computations would be unduly complicated.

As a measure of the total error $S$ in a description of experimental findings via the function $y = kx$ we take the sum of the deviations in all experiments, that is,

$$S = \sum_{n=1}^{p} (y_n - kx_n)^2 \tag{6}$$

The method of determining the constants in a formula by requiring that the total deviation $S$ be least is called the method of least squares.

Observe that if one quantity $y_n - kx_n = 10$, that is, for some one $x = x_n$ the formula yields an error of 10 units, then this will introduce 100 units into $S$. On the other hand, if we have 10 errors of 1 each, then there will be only 10 units in $S$. It is therefore clear that $S$ is mostly affected by the largest errors, whereas small errors, even if they occur frequently, have but a small influence. The method of least squares is aimed at reducing the largest deviations.

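To find $k = \bar{k}$ for which $S$ is the least, we solve the equation $\frac{dS}{dk} = 0$. Using (6), we get

$$\frac{dS}{dk} = 2\sum_{n=1}^{p} (y_n - kx_n)(-x_n) = 0$$

whence

$$2k\sum_{n=1}^{p} x_n^2 - 2\sum_{n=1}^{p} x_n y_n = 0$$

which yields

$$k = \bar{k} = \frac{\sum_{n=1}^{p} x_n y_n}{\sum_{n=1}^{p} x_n^2} = \frac{x_1y_1 + x_2y_2 + \ldots + x_py_p}{x_1^2 + x_2^2 + \ldots + x_p^2} \tag{7}$$

If in every experiment we get $y_n = kx_n$ exactly, then from (7) we obtain

$$\bar{k} = \frac{x_1 \cdot kx_1 + x_2 \cdot kx_2 + \ldots + x_p \cdot kx_p}{x_1^2 + x_2^2 + \ldots + x_p^2} = k$$

If the quantity $k_n = \frac{y_n}{x_n}$ differs in different experiments, then by substituting in (7) the value $k_nx_n$ for $y_n$ we obtain

$$\bar{k} = \frac{k_1x_1^2 + k_2x_2^2 + \ldots + k_px_p^2}{x_1^2 + x_2^2 + \ldots + x_p^2} \tag{8}$$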


Among the quantities $k_1, k_2, \ldots, k_p$ obtained in distinct experiments there is a largest one $k_{\max}$ and a smallest one $k_{\min}$. If in the right member of (8) we replace all $k_n$ by $k_{\max}$, then the fraction will only increase and we get

$$\bar{k} \leq \frac{k_{\max}x_1^2 + k_{\max}x_2^2 + \ldots + k_{\max}x_p^2}{x_1^2 + x_2^2 + \ldots + x_p^2} = k_{\max}$$

In the very same way we can prove that $\bar{k} \geq k_{\min}$.

Thus, the $\bar{k}$ found from the minimum condition of $S$ satisfies the inequalities $k_{\min} \leq \bar{k} \leq k_{\max}$, which means it is indeed a mean of all the values $k_1, k_2, \ldots, k_p$; but this mean is formed by a more complicated rule than

$$\bar{k} = \frac{k_1 + k_2 + \ldots + k_p}{p}$$

In (8) each $k_n$ enters into the numerator with the factor $x_n^2$, which is called the weight.*

It is clear that the greater the weight $x_n^2$, the greater the effect the measurement corresponding to the value $x = x_n$ will have on $\bar{k}$. This confirms the idea expressed earlier that measurements with large $x_n$ are more important for a correct determination of $k$.
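Formulas (7) and (8) can be tried out on a small data set. In the Python sketch below the measurements are made up purely for illustration, and the function name `fit_proportional` is an assumption.

```python
def fit_proportional(xs, ys):
    """Formula (7): k = sum(x*y) / sum(x**2)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Made-up measurements, for illustration only
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

k_ls = fit_proportional(xs, ys)
k_n = [y / x for x, y in zip(xs, ys)]           # one estimate per experiment
k_mean = sum(k_n) / len(k_n)                    # plain arithmetic mean
# Formula (8): the same k as a mean of the k_n with weights x_n**2
k_weighted = sum(k * x * x for k, x in zip(k_n, xs)) / sum(x * x for x in xs)

print(f"least squares k = {k_ls:.4f}, plain mean = {k_mean:.4f}")
```

The least-squares value coincides with the weighted mean and lies between the smallest and largest $k_n$; it differs from the plain mean because the measurements at large $x_n$ count for more.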

If there is no reason to suppose that $y = 0$ for $x = 0$, then the simplest is the formula $y = kx + b$. Here too we can apply the method of least squares. For this case, $S$ is given by

$$S = \sum_{n=1}^{p} (y_n - kx_n - b)^2 \tag{9}$$

The aim is to choose $k$ and $b$ so that $S$ is a minimum.

We proceed as follows. If $b$ were known, then only $k$ would have to be changed in the right member of (9), and so it should be true that**

$$\frac{\partial S}{\partial k} = 2\sum_{n=1}^{p} (y_n - kx_n - b)(-x_n) = 0$$

* The name “weight” stems from the following mechanical analogy. Imagine a scale with points spaced at distances of $k_1, k_2, \ldots, k_p$ and with weights at these points. If all weights are the same, the centre of gravity of the system (we ignore the weight of the scale itself) lies at the scale point $\bar{k} = \frac{k_1 + k_2 + \ldots + k_p}{p}$. But if at point $k_1$ we have a weight $x_1^2$, at $k_2$ a weight $x_2^2$, ..., and at $k_p$ a weight $x_p^2$, then the position of the centre of gravity is given by formula (8). Thus, this formula corresponds to the idea of a different significance and of different weights of separate observations.

** When we consider a function of several variables, the derivative with respect to one of the variables, the others being held constant, is denoted by $\partial$ and not $d$ (see, for example, HM, Sec. 3.12). We will have more to say on this topic in Ch. 4.

On the other hand, if $k$ were already known, then

$$\frac{\partial S}{\partial b} = -2\sum_{n=1}^{p} (y_n - kx_n - b) = 0$$

These two conditions give us the following system of equations for determining the numbers $k$ and $b$:

$$\begin{aligned} \sum_{n=1}^{p} x_ny_n - k\sum_{n=1}^{p} x_n^2 - b\sum_{n=1}^{p} x_n &= 0, \\ \sum_{n=1}^{p} y_n - k\sum_{n=1}^{p} x_n - bp &= 0 \end{aligned} \tag{10}$$

It is easy to find $k$ and $b$ from (10). For brevity set

$$\sigma_1 = \sum_{n=1}^{p} x_n, \quad \sigma_2 = \sum_{n=1}^{p} x_n^2, \quad r_0 = \sum_{n=1}^{p} y_n, \quad r_1 = \sum_{n=1}^{p} x_ny_n$$

Then the system (10) can be rewritten thus:

$$\begin{aligned} \sigma_2 k + \sigma_1 b &= r_1, \\ \sigma_1 k + pb &= r_0 \end{aligned}$$

Solving it we obtain

$$k = \frac{pr_1 - r_0\sigma_1}{p\sigma_2 - \sigma_1^2}, \quad b = \frac{r_0\sigma_2 - r_1\sigma_1}{p\sigma_2 - \sigma_1^2}$$
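The solution for $k$ and $b$ translates directly into a short routine. The Python sketch below follows the notation $\sigma_1$, $\sigma_2$, $r_0$, $r_1$ of the text; the function name `fit_line` and the sample points are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Solve sigma2*k + sigma1*b = r1, sigma1*k + p*b = r0 for k and b."""
    p = len(xs)
    sigma1 = sum(xs)
    sigma2 = sum(x * x for x in xs)
    r0 = sum(ys)
    r1 = sum(x * y for x, y in zip(xs, ys))
    det = p * sigma2 - sigma1 ** 2          # zero if all x_n coincide
    if det == 0:
        raise ValueError("all x values coincide; the slope is undetermined")
    k = (p * r1 - r0 * sigma1) / det
    b = (r0 * sigma2 - r1 * sigma1) / det
    return k, b

# Made-up points scattered about y = 2x + 1, for illustration only
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
k, b = fit_line(xs, ys)
print(f"k = {k:.3f}, b = {b:.3f}")

# With only two points the fitted line passes through both of them
k2, b2 = fit_line([1.0, 2.0], [3.0, 5.0])
```

The denominator $p\sigma_2 - \sigma_1^2$ vanishes only when all the $x_n$ are equal, in which case the slope cannot be determined from the data; the explicit check for this case is an addition of the sketch.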

But this method can be viewed from another angle.

By specifying a linear relationship

$$y = kx + b$$

between the quantities $x$ and $y$, we obtain two unknown parameters $k$ and $b$. By means of measurements we arrive at certain relations between these parameters:

$$\left.\begin{aligned} kx_1 + b &= y_1, \\ kx_2 + b &= y_2, \\ \ldots\ldots\ldots\ldots & \\ kx_p + b &= y_p \end{aligned}\right\}$$

In other words, we have a system of $p$ equations in two unknowns. For $p > 2$, such a system is overdetermined since it is sufficient in principle to have two equations in order to find the unknowns. But if we note that the physical quantities $x$ and $y$ are measured with a definite error, we find that in the case of two measurements (when $p = 2$) the values of $k$ and $b$ may be essentially affected by random errors of measurement and so the accuracy of the result remains obscure. For this reason, it is dangerous to reduce the number of equations containing such random factors. On the contrary, the larger the number of measurements, that is, the more overdetermined the system, the better, because then the random errors of individual measurements cancel out and the solution found by the method of least squares becomes more reliable.

It  is  not  difficult  to  generalize  the  method  of  least  squares  to 
the  case  of  more  complicated  relationships  between  the  quantities 
x and  y.  It  is  well  to  bear  in  mind,  however,  that  the  method  of 
least  squares  frequently  leads  to  rather  unwieldy  computations. 
When  the  desired  parameters  appear  nonlinearly  in  the  relations, 
the  method  leads  to  a system  of  nonlinear  equations  with  a conco- 
mitant extreme  buildup  in  computational  difficulty.  It  is  for  this 
reason  that  in  practice  graphical  methods  of  curve  fitting  have 
proved  more  effective.  We  shall  consider  them  in  the  next 
section. 

Exercises 

1. Using the method of least squares, choose a formula of the type $y = kx$ based on the following experimental findings:

(a)

 x   0.25   0.50   0.75   1.00   1.25   1.50   1.75   2.00
 y   0.40   0.50   0.90   1.28   1.60   1.66   2.02   2.40

(b)

 x   0.25   0.50   0.75   1.00   1.25
 y   0.16   0.18   0.80   0.60   1.08

(c)

 x   0.4    0.8    1.2    1.6    2.0
 y   0.69   1.44   2.08   2.74   3.52

In each of the foregoing three cases, plot the tabulated points and the straight line obtained by the method of least squares. (Use cross-section paper.)

2. Using the following table, choose the numbers $k$ and $b$ for the formula $y = kx + b$ by the method of least squares:

 x   -0.20   0.20   0.40   0.60   0.70   0.80
 y    0.96   1.40   1.56   1.74   1.92   2.04


3. Given two points $(x_1, y_1)$ and $(x_2, y_2)$. Use them to choose the numbers $k$ and $b$ for the equation of the straight line $y = kx + b$ via the method of least squares. Show that in doing so we obtain an absolutely exact result, that is, we get the equation of the straight line passing through the two indicated points.

2.4  The  graphical  method  of  curve  fitting 

Recall that the equation of a straight line has the form $y = kx + b$, where the numbers $k$ and $b$ have a simple geometric meaning (see HM, Sec. 1.4): $b$ is the line segment intercepted on the $y$-axis and $k$ is the slope of the straight line to the $x$-axis (Fig. 11).

Let us assume that $y$ and $x$ are related linearly, i.e. $y = kx + b$. Plot the experimental points on the graph. Now put a transparent rule on the graph and move it to obtain the straight line closest to the experimental points (Fig. 12).

By drawing that line we obtain $b$ and $k = Y/X$ from the figure.

The big advantage of the graphical method is its pictorial nature. If the experimental points lie on the line, with the exception of a few that fall outside the line, these points are clearly exhibited and we can see what points have to be verified. If on the whole the experimental points do not lie on a straight line, this too is clear from the graph. In this case the relationship between $x$ and $y$ is more complicated than $y = kx + b$. An added advantage of the graphical method is that no messy calculations like those involved in the method of least squares are needed and the chances of an accidental calculation error are reduced.

The straight line occupies an exceptional place in the graphical method of curve fitting. No other line can be drawn so simply and reliably through the given points. Anyone who has ever compared in practice the numbers $k$ and $b$ of a straight line determined from a graph with those found by the least-squares method knows that the difference is always very slight.

How  does  one  go  about  choosing  the  constants  of  a formula 
with  the  aid  of  the  graph  if  the  formula  is  more  complicated  than 
y = kx  + b? 

Let  us  consider  an  example. 

Suppose we are investigating the relationship between the temperature $T$ of a wire and the direct current $i$ flowing in the wire. It is clear that a change in the direction of current flow does not change $T$, that is, $T(-i) = T(i)$. Therefore a relationship like $T = ai + b$ is not suitable. We will seek a formula like $T = ai^2 + b$. The graph of the function $T(i)$ is a parabola and it is not easy to draw a parabola by eye. We therefore introduce a new variable $z = i^2$. Then $T = az + b$ so that in terms of the coordinates $z$, $T$ the desired relationship is given by a straight line. The temperature $b = T_0$ in the absence of current may be taken as known, so that it remains to determine the coefficient $a$ of $i^2$.

Fig. 11

Fig. 12

For heavy currents with accompanying high temperatures, the resistance of the wire cannot be assumed constant. For this reason, the heating capacity (the amount of heat released per unit time), which is equal to $W = Ri^2$, is not in reality merely proportional to $i^2$ since $R$ varies. In the equation of the thermal system

$$W = Ri^2 = \alpha S(T - T_0)$$

where $\alpha$ is the heat transfer coefficient and $S$ is the surface of the wire, the coefficient $\alpha$ is also variable at high temperatures. However we still have an equality of temperatures for currents $i$ and $-i$. It is therefore natural to insert into the formula $T = ai^2 + b$, which may now prove to be inexact, the term $ci^4$ instead of $ci^3$.


And so we seek the formula in the form $T = ci^4 + ai^2 + b$. Note that $T = b$ for $i = 0$ so that $b$ does not differ from the ambient temperature and is therefore known (see above). We rewrite the formula thus:

$$\frac{T - b}{i^2} = ci^2 + a$$

Introducing new variables $x = i^2$, $y = \frac{T - b}{i^2}$, we obtain $y = cx + a$, which means $x$ and $y$ are linearly related. It is easy to determine $a$ and $c$ by constructing a graph in terms of the coordinates $x$ and $y$.
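The change of variables can be tried on synthetic data. In the sketch below the "measurements" are generated from assumed coefficients rather than taken from any experiment, and the straight line is found by least squares instead of by eye; the helper `fit_line` and all numbers are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = slope*x + intercept."""
    p = len(xs)
    sx, sxx = sum(xs), sum(x * x for x in xs)
    sy, sxy = sum(ys), sum(x * y for x, y in zip(xs, ys))
    det = p * sxx - sx * sx
    return (p * sxy - sx * sy) / det, (sy * sxx - sxy * sx) / det

# Synthetic "measurements" generated from assumed coefficients
a_true, c_true, b = 3.0, 0.2, 20.0       # b: known ambient temperature
currents = [0.5, 1.0, 1.5, 2.0, 2.5]
temps = [c_true * i**4 + a_true * i**2 + b for i in currents]

# New variables: x = i**2, y = (T - b) / i**2, so that y = c*x + a
xs = [i**2 for i in currents]
ys = [(T - b) / i**2 for i, T in zip(currents, temps)]
c_fit, a_fit = fit_line(xs, ys)
print(f"c = {c_fit:.4f}, a = {a_fit:.4f}")
```

Because the data here are exact, the slope and intercept of the line return the assumed $c$ and $a$; with real measurements the points would scatter about the line, and the division by $i^2$ would magnify the errors at small currents.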

Thus,  the  general  idea  behind  the  graphical  method  is  to  intro- 
duce new  variables  so  that  the  desired  relationship  in  terms  of  these 
variables  is  linear. 

Here  are  some  other  examples. 

Frequently, the relationship between $x$ and $y$ is such that it is known for sure that $y$ must be zero when $x = 0$, but the experimental findings on the graph do not lie on a straight line. The following formula may then be true:

$$y = ax + bx^2$$

Divide all terms by $x$ to get $y/x = a + bx$. Putting $y/x = z$, we obtain $z$ as a linear function of $x$:

$$z = a + bx$$

Another formula that may come in handy in this case is $y = ax^n$. The question is how to determine the exponent $n$. To do this we take logs of both sides of the equation:

$$\log y = n \log x + \log a$$

Introducing new variables $z = \log y$, $t = \log x$, we obtain the linear relationship

$$z = \log a + nt$$
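The logarithmic change of variables for the power law can be sketched as follows. The data are synthetic, generated from an assumed law $y = 2.5x^{1.5}$, and the straight line in the coordinates $t$, $z$ is fitted by least squares; the helper `fit_line` is an illustrative assumption.

```python
import math

def fit_line(ts, zs):
    """Least-squares straight line z = slope*t + intercept."""
    p = len(ts)
    st, stt = sum(ts), sum(t * t for t in ts)
    sz, stz = sum(zs), sum(t * z for t, z in zip(ts, zs))
    det = p * stt - st * st
    return (p * stz - st * sz) / det, (sz * stt - stz * st) / det

# Synthetic data from the assumed law y = 2.5 * x**1.5
a_true, n_true = 2.5, 1.5
xs = [0.5, 1.0, 2.0, 4.0, 8.0]
ys = [a_true * x**n_true for x in xs]

# New variables t = log x, z = log y give the straight line z = log a + n*t
ts = [math.log10(x) for x in xs]
zs = [math.log10(y) for y in ys]
n_fit, log_a = fit_line(ts, zs)
a_fit = 10 ** log_a
print(f"n = {n_fit:.4f}, a = {a_fit:.4f}")
```

The slope of the line gives the exponent $n$ and the intercept gives $\log a$; the base of the logarithm affects only the intercept.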

The law of radioactive decay is given by the equation $n = n_0 e^{-\omega t}$, where $n$ is the number of atoms that have not yet decayed at time $t$, $n_0$ is the total number of atoms, and $\omega$ is the probability of decay per unit time. Taking logs of both sides, we get

$$\ln n = \ln n_0 - \omega t$$

We thus obtain a straight line in terms of the coordinates $t$, $y = \ln n$. (Radioactive decay is discussed in detail in HM, Sec. 5.3.)

Investigations of some quantity x as a function of temperature T
very often yield a formula like

$$x = Be^{-\frac{A}{kT}}$$

Such a formula results in cases* where a contribution is made only
by those molecules (or electrons) whose energies exceed A; k is the
Boltzmann constant ($k = 1.4 \times 10^{-16}$ erg/deg).

Taking logs, we get $\ln x = \ln B - \frac{A}{k}\cdot\frac{1}{T}$. The relationship
becomes linear if we consider the quantities $y = \frac{1}{T}$ and $z = \ln x$.
Indeed, $z = \ln B - \frac{A}{k}\,y$.

In  the  foregoing  examples  we  chose  a type  of  formula  and 
then  introduced  new  variables  so  as  to  make  the  relationship  between 
the  new  variables  linear.  However,  it  may  happen  that  the  experi- 
mental points  in  the  new  variables  do  not  lie  on  a straight  line, 
which  means  that  the  type  of  formula  was  inaptly  chosen  and  that 
a different  type  must  be  sought. 

Suppose a series of experiments has been carried out in which
for values of the argument $x_1, x_2, \ldots, x_p$ we have obtained the function
values $y_1, y_2, \ldots, y_p$. Let the values of the argument be arranged in
an increasing sequence $x_1 < x_2 < \ldots < x_p$. Determining an expected
experimental value of y for a given value of x lying inside the
investigated range of the argument ($x_1 < x < x_p$) constitutes an
interpolation problem (cf. the beginning of Sec. 2.1).

Interpolation  is  a simple  matter  if  an  empirical  formula  has  been 
found.  And  if  the  formula  is  a suitable  one,  the  interpolation  ordi- 
narily yields  good  results  and  rarely  leads  to  crude  errors.  A much 
more  difficult  problem  is  this:  what  value  of  y is  to  be  expected 
from experiment for a certain value of x lying outside the
experimental range of the argument, for example, when $x > x_p$.
Determining such a value from experiment constitutes an extrapolation
problem.

The  solution  of  extrapolation  problems  requires  a careful  study 
of  the  matter  in  each  specific  instance.  Such  problems  cannot  be 
solved  in  a formal  manner  merely  by  using  a selected  formula. 
(This idea was put very neatly by Kretya Patachkuvna in the book
Pi, when she said: "It is very hard to foresee anything, particularly
if it's in the future.")

For example, if experimental findings yield a formula of the
type $y = a + bx + cx^2 + px^3$, and this formula gives a good
description of experimental results, then as a rule the terms $cx^2$, $px^3$ are
introduced to describe the deviation of experimental points from the
straight line over the range of x where the measurements were made.
In this case, the terms $cx^2$ and $px^3$ are ordinarily in the form of
small corrections to the principal term $a + bx$.


* Several such cases are discussed in HM, Ch. 7.

Now  if  we  use  such  a formula  and  extrapolate  y to  values  of  x 
that are far outside the experimental range, then the terms $cx^2$
and $px^3$ will become dominant, yet this might not at all fit the
essence  of  the  matter.  This  situation  is  reminiscent  of  an  Andersen 
fairy  tale  where  a shadow  separates  itself  from  the  man  and  begins 
an  existence  all  its  own,  launching  out  on  a career  and  finally 
forcing  the  man  himself  into  submission. 

If y approaches a definite value $y_\infty$ as x increases without bound,
then it is useful to find that value. This problem is called
extrapolation to infinity. In the solution of this problem it is
frequently useful to introduce a new independent variable z that
remains finite for $x = \infty$, say, $z = 1/x$. After such a transition, the
interval (in z) over which the extrapolation is performed will be
finite.

Let  us  consider  an  example. 

Suppose  we  have  a long  strip  of  explosive.  At  one  end  we  set 
off  an  explosion  that  is  propagated  lengthwise  along  the  strip.  It  is 
clear  that  if  the  strip  is  very  long,  its  action  on  some  obstacle  will 
cease  to  be  dependent  on  the  length  of  the  strip.  This  is  quite  evident 
since  if  we  increase  the  length  of  the  strip  sufficiently,  we  thus  also 
increase  the  quantity  of  explosive  which  is  at  some  distance  from 
the obstacle and which for this reason makes hardly any
contribution at all. Suppose for example we denote by y the maximum
thickness of a steel wall to be destroyed by an explosive charge of
length x. Experimental findings have been plotted in Fig. 13. We
can see here that as x increases, y approaches a definite value, $y_\infty$.
But one cannot find that value from the graph.

The question is how to find it. Suppose that for large x the
formula is of the form $y = y_\infty - a/x$. Introducing a new variable
$z = 1/x$, we get $y = y_\infty - az$. Now $y_\infty$ corresponds to $z = 0$.
Plotting the experimental findings in terms of the coordinates
z, y (in Figs. 13 and 14 the numbers in brackets label the same
points), we can determine by eye the presumed value of y for $z = 0$
(Fig. 14).

The formula $y = y_\infty - a/x$ holds only for sufficiently large x.
For $x = a/y_\infty$ it gives $y = 0$ and if $x < a/y_\infty$ we even get $y < 0$,
which is completely meaningless. Therefore, the points in Fig. 14
obtained experimentally lie on a curve and not on a straight line;
however, in terms of the coordinates z and y we can extrapolate
this curve to $z = 0$, which corresponds to $x = \infty$.

From the physical meaning of the problem it is also clear that
y must be 0 at x = 0. The simplest formula reflecting both these
properties of the function y ($y = 0$ for $x = 0$, and y approaches the
value $y_\infty$ as x increases without bound) is of the form

$$y = \frac{y_\infty x}{x + b} \qquad (11)$$

How can we determine the constants $y_\infty$ and b by using
experimental findings?

First of all, rewrite the formula as

$$\frac{1}{y} = \frac{x + b}{y_\infty x} = \frac{1}{y_\infty} + \frac{b}{y_\infty}\cdot\frac{1}{x}$$

We can therefore hope that if we carry out the construction on a
graph of $1/y$ versus $1/x$, we will get a straight line and will be able
to determine $1/y_\infty$ and $b/y_\infty$. Even if the points do not fit the straight
line closely, the extrapolation on this graph is more reliable than on
the graph of Fig. 14 since formula (11) was constructed with account
taken of two properties (instead of the earlier one property) of the
function y(x).
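
The construction on the graph of $1/y$ versus $1/x$ can be imitated numerically. The sketch below is only an illustration under stated assumptions: it replaces the line drawn by eye with a least-squares line (the helper `fit_line` is hypothetical) and uses the data of Exercise 5 of this section.

```python
def fit_line(us, ws):
    """Least-squares straight line w = intercept + slope * u (hypothetical helper)."""
    m = len(us)
    u_mean = sum(us) / m
    w_mean = sum(ws) / m
    s_uu = sum((u - u_mean) ** 2 for u in us)
    s_uw = sum((u - u_mean) * (w - w_mean) for u, w in zip(us, ws))
    slope = s_uw / s_uu
    return w_mean - slope * u_mean, slope

# Data of Exercise 5 of this section
xs = [1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
ys = [1.6, 1.7, 2.0, 2.3, 2.4, 2.5]

# Formula (11) rewritten: 1/y = 1/y_inf + (b / y_inf) * (1/x)
intercept, slope = fit_line([1 / x for x in xs], [1 / y for y in ys])
y_inf = 1 / intercept   # the intercept at 1/x = 0 is 1/y_inf
b = slope * y_inf       # the slope is b / y_inf
print(f"y_inf = {y_inf:.2f}, b = {b:.2f}")
```

The intercept of the fitted line at $1/x = 0$ is the extrapolation to infinity; a least-squares line weights all points equally, so its result need not coincide exactly with a value read off a hand-drawn graph.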
Exercises

1. Using the data of the following table, find a formula of the
form $y = ax^2 + b$:

   x | 0.5  | 1.0  | 1.5  | 2.0  | 2.5  | 3.0
   y | 1.20 | 1.10 | 2.35 | 3.05 | 4.40 | 5.50

Solve this problem in two ways:

(a) via the method of least squares, and

(b) graphically.

2. Find graphically a formula of the form $y = ax^2 + bx$ if the
experimental findings are:

   x | 0 | 0.5 | 1.0 | 1.5 | 2.0 | 2.5 | 3.0
   y | 0 | 1.7 | 3.1 | 3.8 | 3.9 | 3.8 | 3.0

3. Use the tabulated data

   x | 1   | 2   | 3   | 4
   y | 0.5 | 1.4 | 2.5 | 4

to find a formula of the form $y = Ax^b$. (Use the graphical
method.)

4. Measurements yield the following data:

   x | 0.5  | 1.0  | 1.5  | 2.0  | 2.55 | 3.0  | 3.5  | 4.0
   y | 1.66 | 1.58 | 1.50 | 1.44 | 1.37 | 1.30 | 1.22 | 1.17

Besides, we know that:

(a) y approaches zero as x increases without bound;

(b) y has a very definite value at x = 0. These conditions are
satisfied, for instance, by the following simple formulas:

$$y = \frac{1}{B + Cx} \quad\text{and}\quad y = Ae^{-kx}$$

Choose values for the parameters of these formulas. Using
the formulas, obtain the values of y for $x = 1.25$, $x = 3.75$,
$x = -1$, $x = -2$, $x = -3$.

Compare the results.
5. The following data have been obtained from experiment:

   x | 1.0 | 1.5 | 2.0 | 3.0 | 4.0 | 5.0
   y | 1.6 | 1.7 | 2.0 | 2.3 | 2.4 | 2.5

Furthermore, it is known that y approaches a value $y_\infty$ as x
increases without bound. Find this limiting value in two ways:

(a) by choosing a formula of the form $y = A + \frac{B}{x}$;

(b) by choosing a formula like $y = \frac{ax}{x + b}$.


ANSWERS  AND  SOLUTIONS 


Sec.  2.1 

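1. By formula (2), $y_{3/2} = 1.45$; by (4), $y_{3/2} = 1.41$; by (5),
$y_{3/2} = 1.43$; by the most accurate formula (Bessel's), $y_{3/2} = 1.42$.

2. If $y_i = y|_{x = x_i} = x_i^n$ and $\Delta x = h$, then by the binomial theorem,

$$\delta y_{i+1/2} = \delta y|_{x = x_i + h/2} = (x_i + h)^n - x_i^n = nhx_i^{n-1} + \frac{n(n-1)}{2}h^2x_i^{n-2} + \ldots$$

Setting $x_i + h/2 = x_i'$, we get

$$\delta y|_{x = x_i'} = nh\left(x_i' - \frac{h}{2}\right)^{n-1} + \frac{n(n-1)}{2}h^2\left(x_i' - \frac{h}{2}\right)^{n-2} + \ldots$$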

Removing the brackets in the right member, we see that in this
example the differences form a sequence of values of a polynomial
of degree $n - 1$. The same will hold true for the function $y = ax^n$
(a = constant) since in the formation of differences the coefficient
a is a common factor. Observing that when functions are added,
their differences are also added, we conclude that for any
polynomial of degree n the differences form a sequence of values of some
polynomial of degree $n - 1$. This means the second differences
form a sequence of values of a polynomial of degree $n - 2$, and
so on, while the nth differences constitute a sequence of values
of a zero-degree polynomial, or constants. For this reason, the
differences of order $n + 1$ are here equal to zero.

Sec.  2.2 

$y'_{1/2} = 0.50$, $y'_{3/2} = 0.80$, $y'_{5/2} = 1.38$. By linear interpolation,
$y'_1 = 0.65$, $y'_2 = 1.09$. By formula (4), $y'_1 = 0.62$.

By formula (5), $y'_2 = 1.06$.


Sec. 2.3

1. (a) $y = 1.18x$, (b) $y = 0.78x$, (c) $y = 1.75x$.

2. $y = 1.03x - 1.19$.

3. The equation of a straight line is of the form $y = kx + b$.

The numbers k and b are found from the conditions $y = y_1$ for
$x = x_1$ and $y = y_2$ for $x = x_2$. We obtain the system of
equations

$$kx_1 + b = y_1, \qquad kx_2 + b = y_2$$

from which we find

$$k = \frac{y_1 - y_2}{x_1 - x_2}, \qquad b = \frac{x_1y_2 - x_2y_1}{x_1 - x_2} \qquad (12)$$

We will show that the same values of k and b are obtained by using
the method of least squares.

In our case

$$S = (y_1 - kx_1 - b)^2 + (y_2 - kx_2 - b)^2$$

Therefore

$$\left.\frac{\partial S}{\partial k}\right|_{b=\text{constant}} = -2[(y_1 - kx_1 - b)x_1 + (y_2 - kx_2 - b)x_2]$$

$$\left.\frac{\partial S}{\partial b}\right|_{k=\text{constant}} = -2[(y_1 - kx_1 - b) + (y_2 - kx_2 - b)]$$

Equating these derivatives to zero, we get the following
system of equations:

$$(y_1 - kx_1 - b)x_1 + (y_2 - kx_2 - b)x_2 = 0, \qquad y_1 - kx_1 - b + y_2 - kx_2 - b = 0$$

Solving this system yields the same values as in (12).
Incidentally, the coincidence of results may be obtained from
the simple argument that one straight line can be drawn
through any two points.

Sec.  2.4 

1. Form a sum in order to take advantage of the method of least
squares:

$$S = \sum_{k=1}^{6}(y_k - ax_k^2 - b)^2$$

Proceeding further in the usual way, we find a = 0.48, b = 1.23,
that is, the desired formula is

$$y = 0.48x^2 + 1.23$$

To solve this problem graphically, we introduce a new variable
$t = x^2$; then $y = at + b$. In terms of the coordinates (t, y) we
obtain a straight line. Plotting the points $(t_k = x_k^2,\ y_k)$ on the
graph, we find a = 0.49, b = 1.35, or $y = 0.49x^2 + 1.35$.

2. Set $\frac{y}{x} = z$, then $z = ax + b$. Plotting points in terms of the
coordinates (x, z), we find $a = -1.02$, $b = 4$, so that
$y = -1.02x^2 + 4x$.

3. In this case, lay off log x and log y along the coordinate axes
to obtain A = 0.5, b = 1.5, $y = 0.5x^{1.5}$.

4. Using the graphical method, we find $y = \frac{1}{0.56 + 0.07x}$,
$y = 1.74e^{-0.1x}$. The values of y found by the first and second
formulas for several of the indicated values have been tabulated
as follows:

   x    | Value of y by first formula | Value of y by second formula
   1.25 | 1.54                        | 1.54
   3.75 | 1.22                        | 1.20
   -1   | 2.01                        | 2.00
   -2   | 2.38                        | 2.12
   -3   | 2.86                        | 2.35

The meaning of the foregoing calculation is this: as indicated
in the text, interpolation rarely leads to considerable errors.
In our case, both formulas yield close results for $x = 1.25$ and
for $x = 3.75$. Extrapolation, on the contrary, is unreliable. It
is evident from the table that the farther x is from the tabulated
values, the bigger the difference in the values of y obtained by
the different formulas. Observe that for $x = -8$ the first formula
is meaningless, while for $x < -8$, the first formula yields $y < 0$
and the second one gives $y > 0$.

5. (a) $y_\infty = 2.8$; (b) $y_\infty = 2.9$.


Chapter  3 

MORE  ON  INTEGRALS  AND  SERIES 


3.1  Improper  integrals 

In the usual definition of an integral (see, for
example, HM, Sec. 2.8) it is assumed that the
interval of integration is finite and that the
integrand does not become infinite on the
interval. Such integrals are termed proper
integrals. If at least one of these two conditions
does not hold true, then the integral is said to
be improper. Such integrals occur even in rather
simple problems of integral calculus (see, for
instance, HM, Secs. 3.16, 4.3, 6.2).

We first consider an integral of the form

$$I = \int_a^{\infty} f(x)\,dx \qquad (1)$$

where the lower limit a and the integrand f(x) for $a \le x < \infty$
are assumed to be finite. Such an integral is improper because the
upper limit is infinite. We say it has a singularity at the upper
limit.

Suppose  that  integral  (1)  was  obtained  in  the  solution  of  a physi- 
cal problem  and  the  variable  x has  a definite  physical  meaning 
(length,  time,  and  the  like).  Then  in  actuality  x does  not  go  to 
infinity  but  to  some  very  large  but  finite  limit,  which  we  denote 
by  N.  Thus,  instead  of  (1)  we  have  to  consider  the  integral 

$$I_N = \int_a^N f(x)\,dx \qquad (2)$$

It may turn out that the integral (2) does depend on N but
for sufficiently large N remains practically unaltered. Then this
value is taken for the value of the integral (1). To be more exact,
it is then assumed that

$$\int_a^{\infty} f(x)\,dx = \lim_{N\to\infty}\int_a^N f(x)\,dx$$

and (1) is called a convergent integral. From this limit and from the
equation

$$\int_a^{\infty} f(x)\,dx = \int_a^N f(x)\,dx + \int_N^{\infty} f(x)\,dx$$
it  is  evident  that  the  basic  contribution  to  a convergent  integral 

is  made  by  its  “finite”  (“proper”)  part,  whereas  the  contribution 
of  the  singularity,  for  sufficiently  large  N,  is  arbitrarily  small.  In 
other  words,  if  (1)  is  convergent,  then  the  “actual”  integral  (2), 
for  large  N,  which  is  often  not  known  exactly,  may  be  replaced 
by  the  “limit”  integral  (1),  which  ordinarily  is  simpler  in  theoretical 
investigations. 

If, as N increases, the value of the integral (2) does not become
steady but tends to infinity or oscillates without having a definite
limit, then (1) is said to be a divergent integral. In that case, the
value  of  (2)  for  large  N depends  essentially  on  N,  and  (2)  cannot 
be  replaced  by  (1).  Then  the  question  can  arise  of  a more  detailed 
description  of  the  behaviour  of  (2)  as  N increases,  that  is,  of  obtain- 
ing asymptotic  formulas  for  the  integral.  (Incidentally,  such  a ques- 
tion also  arises  for  the  convergent  integrals  (1),  since  it  is  often  not 
enough  merely  to  establish  the  fact  of  convergence  or  divergence 
or  even  to  find  the  numerical  value  in  the  case  of  convergence, 
for  the  law  of  convergence  itself  may  be  needed.) 

From  the  foregoing  it  follows  that  the  fact  of  convergence  or 
divergence  of  the  integral  (1)  depends  only  on  the  behaviour  of  the 
function  f(x)  “at  the  singularity  of  the  integral”,  that  is  to  say,  as 
$x \to \infty$. This fact is most often perceived by comparing f(x) with
the power function $C/x^p$, the integral of which can be taken with
ease. Let us consider the integral

$$I = \int_{x_0}^{\infty}\frac{C}{x^p}\,dx \quad (C \text{ constant}) \qquad (3)$$

where $x_0$ is any positive number (if we take $x_0$ negative, then the
integral can have a singularity also at x = 0, where the integrand
becomes infinite). We can readily evaluate this integral:


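$$\int_{x_0}^{\infty} Cx^{-p}\,dx = \left.\frac{Cx^{-p+1}}{-p+1}\right|_{x_0}^{\infty} = \left.\frac{-C}{(p-1)x^{p-1}}\right|_{x_0}^{\infty} = \frac{C}{(p-1)x_0^{p-1}} - \frac{C}{(p-1)\infty^{p-1}} \qquad (4)$$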


Two cases have to be distinguished here: namely, if $p > 1$, then
$p - 1 > 0$, $\infty^{p-1} = \infty$ and the last term on the right of (4) is equal
to zero. Hence, in this case the integral (3) converges. But if $p < 1$,
then $\infty^{p-1} = \frac{1}{\infty^{1-p}} = 0$, and for this reason the last term in (4) is
infinite, which means that in this case (3) diverges to infinity. In other
words, the appropriate integral $I_N$ taken from $x_0$ to N tends to
infinity as N increases. We obtain an expression for $I_N$ if N is
substituted into the right member of (4) in place of $\infty$ (in other cases
too, it is very easy to obtain an expression for $I_N$ if the corresponding
indefinite integral can be evaluated). It is quite evident that in this
expression, for large N, the principal term for $p > 1$ is the first term
and for $p < 1$ the second term.

For p = 1, the integral (3) is

$$\int_{x_0}^{\infty}\frac{C}{x}\,dx = C\ln x\,\Big|_{x_0}^{\infty} = C\ln\infty - C\ln x_0 = \infty$$

Again the integral diverges to infinity. Thus, (3) converges when
$p > 1$ and diverges when $p \le 1$.
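
The three cases can be seen at a glance by tabulating $I_N$ from the closed form. The following Python sketch is an illustration only; the function name `partial_integral` and the choice $C = 1$, $x_0 = 1$ are assumptions made for the example.

```python
import math

def partial_integral(p, n_upper, x0=1.0, c=1.0):
    """I_N = integral of c / x**p from x0 to n_upper, from the closed form (4)."""
    if p == 1:
        return c * math.log(n_upper / x0)
    return c / (p - 1) * (x0 ** (1 - p) - n_upper ** (1 - p))

# p = 2 settles to a limit; p = 1 and p = 1/2 keep growing with N
for p in (2.0, 1.0, 0.5):
    values = [partial_integral(p, n) for n in (1e2, 1e4, 1e6)]
    print(p, [round(v, 4) for v in values])
```

For $p = 2$ the values approach $C/((p-1)x_0^{p-1}) = 1$, while for $p = 1$ they grow logarithmically and for $p = 1/2$ like $2\sqrt{N}$.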

On the basis of this result we can conclude, for example, that
the integral

$$\int_0^{\infty}\frac{dx}{\sqrt[3]{x^2+1}} \qquad (5)$$

which has a singularity at the upper limit, diverges to infinity since
for large x the integrand

$$\frac{1}{\sqrt[3]{x^2+1}} = \frac{1}{x^{2/3}}\cdot\frac{1}{\sqrt[3]{1 + x^{-2}}} \qquad (6)$$

is asymptotically equal to $1/x^{2/3}$, which is to say, in this case
$p = 2/3 < 1$. On the contrary, the integral

$$\int_0^{\infty}\frac{dx}{\sqrt{x^3+1}} \qquad (7)$$

is convergent since the integrand is asymptotically equal to $1/x^{3/2}$
as $x \to \infty$; here, $p = 3/2 > 1$. The integral

$$\int_0^{\infty}e^{-x^2}\,dx \qquad (8)$$

is also convergent since the integrand tends to zero faster than any
power of x as $x \to \infty$. In all these examples, the corresponding
indefinite integrals are not expressible in terms of elementary functions,
so that it would be rather difficult to establish the convergence by
evaluating the indefinite integral.

It is not hard to obtain asymptotic expressions of the last three
integrals taken from 0 to N as N increases. The following device is
used for a divergent integral of type (1): a function $f_1(x)$ is selected
whose integral is easily found, and such that it is asymptotically
(as $x \to \infty$) nearly equal to f(x); then in the right member of

$$\int_a^N f(x)\,dx = \int_a^N f_1(x)\,dx + \int_a^N [f(x) - f_1(x)]\,dx$$

the first integral (principal term) is readily investigated, while the
second integral may prove to be convergent as $N \to \infty$, or the same
device may be applied to it. For the integral (5) it is natural to assume
$f_1(x) = x^{-2/3}$, that is, to write

$$\int_0^N\frac{dx}{\sqrt[3]{x^2+1}} = \int_0^a\frac{dx}{\sqrt[3]{x^2+1}} + \int_a^N\frac{dx}{\sqrt[3]{x^2+1}} = C_1 + \int_a^N\frac{dx}{\sqrt[3]{x^2}} + \int_a^N\left(\frac{1}{\sqrt[3]{x^2+1}} - \frac{1}{\sqrt[3]{x^2}}\right)dx \qquad (9)$$

Here we passed from $\int_0^N$ to $\int_a^N$, where a is a positive number (although
this is not obligatory in the given case), in order to avoid the
improper integral $\int_0^N x^{-2/3}\,dx$, which has a singularity at the lower
limit. It can be verified that as $N \to \infty$ the last integral in (9) is
convergent and for this reason the entire expression (9), for large N,
has the asymptotic representation $3\sqrt[3]{N} + C + \text{an infinitesimal}$,
where C is a constant. In order to find the value of the constant
C, we have to take advantage of

$$C \approx \int_0^N\frac{dx}{\sqrt[3]{x^2+1}} - 3\sqrt[3]{N}$$

for some N, evaluating the integral in the right member via one of
the formulas of numerical integration.
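
A minimal numerical sketch of this recipe follows, assuming composite Simpson's rule as the formula of numerical integration; the function names, the step density and the sample values of N are choices made for this illustration, not taken from the text.

```python
def simpson(f, a, b, n):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    """Integrand of (5): 1 / cube root of (x^2 + 1)."""
    return (x * x + 1.0) ** (-1.0 / 3.0)

def estimate_c(n_upper, steps_per_unit=100):
    """Approximate C by the integral from 0 to N minus 3 * N**(1/3)."""
    n = 2 * int(n_upper * steps_per_unit / 2)  # keep the number of subintervals even
    return simpson(integrand, 0.0, n_upper, n) - 3.0 * n_upper ** (1.0 / 3.0)

c10, c100, c1000 = (estimate_c(n) for n in (10.0, 100.0, 1000.0))
print(c10, c100, c1000)
```

The successive estimates change less and less as N grows, which is the numerical counterpart of the statement that the remainder is an infinitesimal.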
In similar fashion we investigate the asymptotic behaviour
of the integrals (7) and (8). A useful transformation for a convergent
integral is often

$$\int_a^N f(x)\,dx = \int_a^{\infty}f(x)\,dx - \int_N^{\infty}f(x)\,dx$$

For the integral (7) we get

$$\int_0^N\frac{dx}{\sqrt{x^3+1}} = \int_0^{\infty}\frac{dx}{\sqrt{x^3+1}} - \int_N^{\infty}\frac{dx}{\sqrt{x^3+1}} \approx D - \int_N^{\infty}x^{-3/2}\,dx = D - \frac{2}{\sqrt{N}}$$

where the constant D, equal to the value of (7), can be computed as
C was in the preceding paragraph.

Apply integration by parts to (8):

$$\int_0^N e^{-x^2}dx = \int_0^{\infty}e^{-x^2}dx - \int_N^{\infty}e^{-x^2}dx = E + \int_N^{\infty}\frac{1}{2x}\,d\,e^{-x^2} = E - \frac{1}{2N}e^{-N^2} + \frac{1}{2}\int_N^{\infty}\frac{e^{-x^2}}{x^2}\,dx \approx E - \frac{1}{2N}e^{-N^2}$$

The constant E, that is, the value of the integral (8), is equal to
$\sqrt{\pi}/2$, as we shall see in Sec. 4.7.

By way of another illustration we consider the integral

$$\int_0^{\infty}\sin x\,dx \qquad (10)$$

In this case the integral over a finite interval is

$$I_N = \int_0^N\sin x\,dx = -\cos x\,\Big|_0^N = 1 - \cos N$$

As N increases, the value of cos N oscillates and does not have
a definite limit, which means (10) is a divergent integral that diverges
in oscillatory fashion.
It is easy to verify that bringing a declining factor $e^{-\alpha x}$
($\alpha$ a constant > 0) under the integral sign in (10) leads to the
convergent integral

$$\int_0^{\infty}e^{-\alpha x}\sin x\,dx$$

We can also prove convergent an integral of a more general type,

$$\int_0^{\infty}f(x)\sin x\,dx$$

where f(x) is any decreasing function that tends to zero as $x \to \infty$.

Improper integrals different from (1) are considered in a manner
similar to (1). For example, suppose we have the integral

$$\int_a^b f(x)\,dx \qquad (12)$$

for which the limits of integration are finite but the integrand goes
to infinity as $x \to a$, that is, the integral has a singularity at x = a.
This singularity is then "cut off", and instead of (12) we consider

$$\int_{a+\varepsilon}^b f(x)\,dx \qquad (13)$$

where $\varepsilon$ is a small positive number. If for a sufficiently small $\varepsilon$
the integral (13) for all practical purposes no longer depends on $\varepsilon$,
then (12) is said to be a convergent integral and we put

$$\int_a^b f(x)\,dx = \lim_{\varepsilon\to 0}\int_{a+\varepsilon}^b f(x)\,dx$$

In this case we can pass from the integral (13) (which frequently
appears in the solution of physical problems since all physical
quantities are finite) to the simpler integral (12), that is to say, we ignore
the contribution to the integral (12) of the singularity. Now if the
integral (13) depends essentially on $\varepsilon$ for small $\varepsilon$, that is, if it does not
have a finite limit as $\varepsilon \to 0$, but tends to infinity or oscillates without
a definite limit, then (12) is called a divergent integral; in this case
we cannot pass from (13) to (12).

The convergence or divergence of an improper integral of type
(12) is ordinarily established by comparing the integrand f(x) with
the power function $\frac{C}{(x - a)^p}$, which is also equal to infinity at x = a
and is readily integrable. We leave it for the reader to verify that the
improper integral

$$\int_a^b\frac{C}{(x - a)^p}\,dx \quad (C \text{ constant}) \qquad (14)$$

converges for $p < 1$ and diverges for $p \ge 1$.

Fig. 15

To illustrate, let us consider the problem of the flow of liquid
from a cylindrical vessel, in the bottom of which is an opening of
area $\sigma$ (Fig. 15). The level h of the liquid depends on the time t,
$h = h(t)$. If the liquid is not viscous and we can disregard the forces
of surface tension, the rate v of discharge from the vessel can, to a
sufficient degree of accuracy, be given by Torricelli's law:

$$v = \sqrt{2gh}$$

Therefore the volume discharged in time dt is

$$\sigma v\,dt = \sigma\sqrt{2gh}\,dt$$

On the other hand, the same volume is equal to $-S\,dh$ (we take
into account that h decreases and therefore $dh < 0$). Equating both
expressions, we get

$$\sigma\sqrt{2gh}\,dt = -S\,dh, \quad\text{that is,}\quad dt = -\frac{S}{\sigma\sqrt{2g}}\,\frac{dh}{\sqrt{h}}$$

In order to obtain the total discharge time we have to integrate:

$$T = -\frac{S}{\sigma\sqrt{2g}}\int_H^0\frac{dh}{\sqrt{h}} = \frac{S}{\sigma\sqrt{2g}}\cdot 2\sqrt{h}\,\Big|_0^H = \frac{S}{\sigma}\sqrt{\frac{2H}{g}} \qquad (15)$$
Actually, the discharge does not reach h = 0 but only $h = \varepsilon$,
where $\varepsilon$ is a small quantity comparable to the roughness of the
bottom or to the thickness of wetting film; formula (15) ought to
be rewritten thus:

$$T = -\frac{S}{\sigma\sqrt{2g}}\int_H^{\varepsilon}\frac{dh}{\sqrt{h}} \qquad (16)$$

However, since the improper integral (15) turned out convergent
(this was evident from the computations of (15); what is more, it
is an integral of type (14) for $p = \frac{1}{2}$), the integral (16) may be
replaced by (15). It is evident here that $\varepsilon$ is not known exactly and
also that it is not essential because for a convergent integral the
important thing to know is that $\varepsilon$ is small.
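
How little the cut-off matters can be checked with a few lines of Python. This is a sketch under stated assumptions: SI units with g = 9.81 m/s², and purely hypothetical vessel dimensions (S, the opening area and H below are invented for the illustration).

```python
import math

G = 9.81  # m/s^2; SI units are an assumption of this sketch

def discharge_time(S, sigma, H, eps=0.0):
    """Time for the level to fall from H to eps; eps = 0 reproduces formula (15)."""
    return S / (sigma * math.sqrt(2 * G)) * 2 * (math.sqrt(H) - math.sqrt(eps))

S, sigma, H = 0.5, 1e-3, 1.0   # hypothetical vessel: areas in m^2, height in m
t_full = discharge_time(S, sigma, H)
for eps in (1e-4, 1e-6, 1e-8):
    t_cut = discharge_time(S, sigma, H, eps)
    # relative difference between (15) and (16) for this cut-off
    print(eps, t_cut, (t_full - t_cut) / t_full)
```

The relative difference between (15) and (16) equals $\sqrt{\varepsilon/H}$, so it shrinks steadily as $\varepsilon$ decreases, in line with the remark that only the smallness of $\varepsilon$ matters.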

In numerical integration (cf. Sec. 1.1) improper integrals require
particular attention. It often happens that the given integral is
represented in the form of the sum of a proper integral, obtained by
excluding the interval about the singularity from the interval of
integration, and the improper integral taken over the interval near the
singularity. The former is found numerically and the latter by a series
expansion, or the integrand is merely replaced approximately by
some other function (a power function, for example) whose integral
can be readily evaluated. To illustrate, let us evaluate the integral

$$\int_0^{\pi}\frac{dx}{\sqrt{\sin x}}$$

Here the integrand increases without bound as x approaches
0 and as x approaches $\pi$. We partition the interval of
integration into three subintervals: from 0 to $\pi/6$, from $\pi/6$ to $5\pi/6$,
and from $5\pi/6$ to $\pi$. In the first subinterval we can assume that
$\sin x \approx x$, since x is small. Therefore

$$\int_0^{\pi/6}\frac{dx}{\sqrt{\sin x}} \approx \int_0^{\pi/6}\frac{dx}{\sqrt{x}} = 2\sqrt{x}\,\Big|_0^{\pi/6} = 2\sqrt{\frac{\pi}{6}} = 1.45\,^*$$

* A series expansion can be applied for greater accuracy:

$$\int_0^{\pi/6}\frac{dx}{\sqrt{\sin x}} = \int_0^{\pi/6}\frac{1}{\sqrt{x}}\left(1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \ldots\right)^{-1/2}dx = \int_0^{\pi/6}\frac{1}{\sqrt{x}}\left(1 + \frac{x^2}{12} + \frac{x^4}{160} + \ldots\right)dx = \sqrt{x}\left(2 + \frac{x^2}{30} + \frac{x^4}{720} + \ldots\right)\Big|_0^{\pi/6}$$

However, for the accuracy used here the correction is negligible (we get 1.46). The
only useful thing is that we get some idea of the degree of reliability of the result.
In the third subinterval, $5\pi/6 < x < \pi$, we take advantage of the
formula $\sin x = \sin(\pi - x)$, and since the quantity $\pi - x$ is small,
we can put $\sin(\pi - x) \approx \pi - x$. In this subinterval we have
$\sin x \approx \pi - x$. We obtain

$$\int_{5\pi/6}^{\pi}\frac{dx}{\sqrt{\sin x}} \approx \int_{5\pi/6}^{\pi}\frac{dx}{\sqrt{\pi - x}} = -2\sqrt{\pi - x}\,\Big|_{5\pi/6}^{\pi} = 2\sqrt{\frac{\pi}{6}} = 1.45$$

We compute the integral over the second subinterval using Simpson's
rule and dividing the subinterval into two pieces to get

$$\int_{\pi/6}^{5\pi/6}\frac{dx}{\sqrt{\sin x}} \approx \frac{\pi}{9}\,[1.41 + 4\cdot 1 + 1.41] = 2.38$$

Consequently

$$\int_0^{\pi}\frac{dx}{\sqrt{\sin x}} \approx 1.45 + 2.38 + 1.45 = 5.28$$

The exact value of this integral, to two decimal places, is 5.24.
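
The three-piece scheme above can be reproduced in a few lines. The following Python sketch is an illustration under assumptions: the `simpson` helper and the refined variant (three-term series at the ends, 200 Simpson pieces in the middle) are choices made here and are not prescribed by the text.

```python
import math

def f(x):
    """Integrand 1 / sqrt(sin x)."""
    return 1.0 / math.sqrt(math.sin(x))

def simpson(func, a, b, n):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

edge = math.pi / 6

# Crude version, as in the text: sin x ~ x near both ends,
# Simpson's rule with two pieces on the middle subinterval
ends_crude = 2 * (2 * math.sqrt(edge))
total_crude = ends_crude + simpson(f, edge, 5 * edge, 2)

# Refinement (an assumption of this sketch): three-term series at the ends,
# 200 Simpson pieces in the middle
end_series = math.sqrt(edge) * (2 + edge ** 2 / 30 + edge ** 4 / 720)
total_refined = 2 * end_series + simpson(f, edge, 5 * edge, 200)

print(round(total_crude, 2), round(total_refined, 3))
```

Most of the improvement in the refined figure comes from the middle subinterval, where two Simpson pieces are rather coarse; the series correction at the ends is small, as the footnote remarks.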

Exercise

Evaluate the integral $\int_0^{\pi/2}\frac{dx}{\sqrt{\sin x}}$, carrying out the calculations to
within slide-rule accuracy.


3.2  Integrating  rapidly  varying  functions 

When integrating numerically, it is useful to be able to estimate
the order of magnitude of the integral beforehand. From the
geometric meaning of the integral

$$I = \int_a^b y(x)\,dx$$

there follows immediately the obvious estimate

$$I \le \int_a^b y_{\max}\,dx = y_{\max}(b - a)$$

where $y_{\max}$ is the maximum value of the integrand y(x) on the
interval of integration. If this function is positive and varies but
slightly over the interval of integration, we can take $y \approx y_{\max}$, or

$$I \approx y_{\max}(b - a) \qquad (17)$$

(this estimate occurred in HM, Sec. 3.16).
It is well to observe from the start that (17) and also the
subsequent estimates of this section are inconvenient in case of
alternating functions y(x). For alternating functions the interval of
integration may be divided into several parts so that within each part the
sign remains unchanged; then an estimate is made of the integrals
over these parts. However, the overall estimate will be satisfactory
only if the contribution of integrals of one sign essentially exceeds
the contribution of integrals of the opposite sign.

For this reason, throughout this section we will regard the
integrand as positive over the interval of integration.

If the function y(x) over the interval of integration is positive
but decreases rapidly and b is comparatively large, then the
estimate (17) may lead to substantial errors. This is so because when
we use (17), we replace the function with its maximum value.
But if the function varies rapidly, then its values are close to maximum
only over a small portion of the region of integration. To illustrate
we consider $I = \int_0^b e^{-x}\,dx$ ($b > 0$). The maximum value of the
integrand over the integration interval is obtained at x = 0. It is equal
to 1. The estimate (17) yields $I \approx b$. But in this instance it is easy
to obtain an exact formula for the integral: $I = 1 - e^{-b}$. Form a
table of the exact value of I as a function of b:

   b | 0 | 0.1   | 0.2  | 0.5  | 1    | 2    | 3    | 5     | 10
   I | 0 | 0.095 | 0.18 | 0.39 | 0.63 | 0.86 | 0.95 | 0.993 | 0.99996

It is clear from this table that the estimate (17) is fairly decent as
long as b is small (in which case the function varies but slightly over
the interval of integration). But if b is great, then the approximation
$I \approx b$ becomes very poor.
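
The comparison can also be tabulated directly. The short Python sketch below is only an illustration; the function names are invented for the example.

```python
import math

def exact_integral(b):
    """Exact value I = 1 - e^(-b) of the integral of e^(-x) from 0 to b."""
    return 1 - math.exp(-b)

def estimate_17(b):
    """Estimate (17): y_max * (b - a) with y_max = 1 and a = 0."""
    return 1.0 * (b - 0.0)

for b in (0.1, 0.2, 0.5, 1, 2, 3, 5, 10):
    ratio = estimate_17(b) / exact_integral(b)
    print(f"b = {b:<4}  exact = {exact_integral(b):.5f}  "
          f"estimate = {estimate_17(b):<5}  ratio = {ratio:.2f}")
```

The ratio of the estimate to the exact value stays near 1 for small b and grows roughly like b itself once b is large, since the exact value saturates at 1.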

Let a function y(x) be decreasing rapidly over the interval of
integration. Then the maximum value of the function is attained at
the left endpoint of the interval, that is, at x = a. (Observe that this
does not imply the equation $\left.\frac{dy}{dx}\right|_{x=a} = y'(a) = 0$, see Fig. 16.) Since
for the rapidly decreasing function y(x) the integral $I = \int_a^b y\,dx$ cannot
substantially vary as b increases, a rough estimate of the integral I
should not include b; we assume, as it were, $b = \infty$. It is natural
to assume that in this case the integral is roughly equal to the product
of $y_{\max} = y(a)$ by the length $\Delta x$ (independent of b) of the
integration interval. For a given y(a), this length should be the smaller, the
faster the function is decreasing, that is to say, the greater $|y'(a)|$.
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Fig.  16 


The  quantity  Ax  having  the  same  dimensions  as  x can  be  constructed 
in  unique  fashion  by  proceeding  from  y(a)  and  | y'(a)  j: 

Ax  = y ^ - m 
i y'(<*)  I 

where  m is  a nondimensional  constant  of  proportionality.  We  then  get 

I » v(a)  Ax  = ■ y — m (18) 

I ?(a)  I 

If the function y(x) is increasing over the interval of integration,
then it reaches a maximum at the right endpoint of the interval,
that is, at x = b. Then (18) becomes

$$I \approx \frac{y^2(b)}{|y'(b)|}\,m$$

A typical example of a rapidly varying function is the exponential
function $y = Ce^{-kx}$ (C > 0, k > 0). Choose a value m in (18) so that
(18) is absolutely exact for $\int_a^\infty Ce^{-kx}\,dx$.

Since $y = Ce^{-kx}$, $y' = -kCe^{-kx}$, it follows that
$y(a) = Ce^{-ka}$ and $|y'(a)| = kCe^{-ka}$, and so (18) yields

$$I = m\,\frac{C^2e^{-2ka}}{kCe^{-ka}} = m\,\frac{Ce^{-ka}}{k} = \frac{m}{k}\,y(a)$$

The exact value of the integral at hand is

$$I = \int_a^\infty Ce^{-kx}\,dx = \left.-\frac{C}{k}\,e^{-kx}\right|_a^\infty = \frac{C}{k}\,e^{-ka} = \frac{1}{k}\,y(a)$$

Comparing results, we get $\frac{m}{k}\,y(a) = \frac{1}{k}\,y(a)$,
whence m = 1. And so the formula (18) takes the form

$$I \approx \frac{y^2(a)}{|y'(a)|} \tag{19}$$


Generally  speaking,  this  formula  is  not  exact  in  the  case  of  an  infinite 
interval  for  a rapidly  varying  function  of  a different  type  and  also 
in  the  case  of  a finite  interval,  but  it  yields  fairly  decent  results. 

In order to make the meaning of (19) pictorial, draw a tangent
to the curve y = y(x) at the point A (Fig. 16) and find the length
of $A_1N$. The equation of the tangent is $y - y(a) = y'(a)(x - a)$.
Setting y = 0, we get the point of intersection of the tangent and
the x-axis. This produces $x = a - \frac{y(a)}{y'(a)}$ or, noting that
$y'(a) < 0$ since y(x) is a decreasing function, we get
$x = a + \frac{y(a)}{|y'(a)|}$, and so

$$A_1N = a + \frac{y(a)}{|y'(a)|} - a = \frac{y(a)}{|y'(a)|} = \Delta x$$

The estimate of the integral by means of (19) amounts to replacing
the area under the curve y = y(x) by the area of the rectangle in
Fig. 16.

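Consider, for example, the integral $\int_1^\infty \frac{dx}{x^8}$. Here,
$y = \frac{1}{x^8}$, $y' = -\frac{8}{x^9}$, and so y(1) = 1, |y'(1)| = 8,
and (19) gives

$$\int_1^\infty \frac{dx}{x^8} \approx \frac{1}{8} = 0.125$$

The exact value of this integral is

$$I = \int_1^\infty \frac{dx}{x^8} = \left.-\frac{1}{7x^7}\right|_1^\infty = \frac{1}{7} = 0.143$$

The error comes to 13%.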
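The endpoint estimate (19) is easy to try out numerically. The following Python sketch is only an illustration; the helper name `endpoint_estimate` and the sample constants C, k, a are assumptions made for the example.

```python
import math

def endpoint_estimate(y_a, dy_a):
    """Formula (19): I ~ y(a)**2 / |y'(a)| for a rapidly decreasing integrand."""
    return y_a ** 2 / abs(dy_a)

# y = x**(-8) on [1, infinity): y(1) = 1, y'(1) = -8
approx = endpoint_estimate(1.0, -8.0)
exact = 1.0 / 7.0
relative_error = (exact - approx) / exact
print(approx, exact, relative_error)

# For the exponential y = C*exp(-k*x) on [a, infinity) the estimate is exact.
C, k, a = 3.0, 2.0, 0.5
exp_approx = endpoint_estimate(C * math.exp(-k * a), -k * C * math.exp(-k * a))
exp_exact = C / k * math.exp(-k * a)
print(exp_approx, exp_exact)
```

The first case shows the error of about one eighth for the power function; the second confirms that the coefficient m = 1 was fitted to the exponential.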

Another frequently occurring type of integral is that in which
the integrand y(x) reaches a maximum at $x = x_m$ somewhere inside
the interval of integration (Fig. 17a). Here $y'(x_m) = 0$ at the maximum.
We can split the integral into two: one from a to $x_m$ and the other
from $x_m$ to b. Then in each of them the integrand reaches a maximum
at an endpoint of the respective interval. It might appear that this
problem then reduces to the preceding one. Actually this is not so
because splitting the integral into two gives nothing since at $x = x_m$
we still have y' = 0 and the estimate (19) is inapplicable. Thus, this
is indeed a new case and we have to find some other way to isolate
the necessary part $\Delta x$ from the interval of integration. The idea
here is that the quantity $\Delta x$ is determined by the value of
$y''(x_m)$, that is, it is determined by the magnitude of the curvature
at the maximum point. From the figure it is clear that the steeper the
curve, the smaller $\Delta x$ must be. The dimension of the second
derivative coincides with that of the quantity $\frac{y}{x^2}$, and so a
quantity of the same dimension as $\Delta x$ is obtained from the
quantities $y(x_m)$ and $y''(x_m)$ thus*:

$$\Delta x = l\,\sqrt{\frac{y(x_m)}{|y''(x_m)|}}$$

Fig. 17


* We can also arrive at an expression of this type for $\Delta x$ in the
following manner: expand y(x) in a Taylor series near the maximum, that
is, in powers of $x - x_m$, and take the first two terms that do not
vanish. We get $y(x) = y(x_m) + \frac{1}{2}(x - x_m)^2\,y''(x_m)$. Then
find the value of the difference $x - x_m = \Delta x$ at which this
approximate expression for y(x) vanishes:

$$y(x_m) + \frac{1}{2}(x - x_m)^2\,y''(x_m) = 0$$

whence

$$\Delta x = x - x_m = \sqrt{-\frac{2y(x_m)}{y''(x_m)}} = \sqrt{\frac{2y(x_m)}{|y''(x_m)|}}$$

Accordingly, the approximate value of I is

$$I \approx \int_{x_m-\Delta x}^{x_m+\Delta x}\left[y(x_m) + \frac{1}{2}(x - x_m)^2\,y''(x_m)\right]dx = \frac{4}{3}\,y(x_m)\,\Delta x = \sqrt{\frac{32\,y^3(x_m)}{9\,|y''(x_m)|}}$$

(We write $|y''(x_m)|$ instead of $y''(x_m)$ because $y''(x_m) < 0$, since
for $x = x_m$ the function y(x) has a maximum.) The quantity l is a
dimensionless coefficient. For the integral we have the estimate

$$I \approx y(x_m)\,\Delta x = l\,\sqrt{\frac{y^3(x_m)}{|y''(x_m)|}} \tag{20}$$

We determine the value of the coefficient l from the condition that
formula (20) be absolutely exact for $\int_{-\infty}^{+\infty} y\,dx$, where

$$y(x) = Ce^{-kx^2},\quad k > 0,\ C > 0$$

(The graph of the function $y = Ce^{-kx^2}$ for the case C = 3, k = 0.5
is shown in Fig. 17b.)

Here, $x_m = 0$, $y(x_m) = C$, $y''(x_m) = -2Ck$. By (20) we get

$$I = l\,\sqrt{\frac{C^3}{2Ck}} = \frac{lC}{\sqrt{2k}} \tag{21}$$

To find the exact value of the integral
$\int_{-\infty}^{+\infty} Ce^{-kx^2}\,dx$ make the change of variable
$z = x\sqrt{k}$, $dz = \sqrt{k}\,dx$. We then get

$$\int_{-\infty}^{+\infty} Ce^{-kx^2}\,dx = \frac{C}{\sqrt{k}}\int_{-\infty}^{+\infty} e^{-z^2}\,dz$$

The value of the integral $\int_{-\infty}^{+\infty} e^{-z^2}\,dz$ may be
found exactly: in Sec. 4.7 we will show that this integral is equal to
$\sqrt{\pi}$. Therefore

$$\int_{-\infty}^{+\infty} Ce^{-kx^2}\,dx = C\sqrt{\frac{\pi}{k}}$$

Comparing this formula with (21), we get

$$\frac{lC}{\sqrt{2k}} = C\sqrt{\frac{\pi}{k}}$$

whence $l = \sqrt{2\pi}$. Formula (20) takes the form

$$I \approx \sqrt{\frac{2\pi\,y^3(x_m)}{|y''(x_m)|}} \tag{22}$$
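As a quick check of the coefficient in (22), the formula can be evaluated for the Gaussian that was used to fit it. This is a minimal Python sketch; the helper name `peak_estimate` is an assumption made for the example, and the constants C = 3, k = 0.5 are those quoted for Fig. 17b.

```python
import math

def peak_estimate(y_m, d2y_m):
    """Formula (22): I ~ sqrt(2*pi*y(x_m)**3 / |y''(x_m)|) for an interior maximum."""
    return math.sqrt(2.0 * math.pi * y_m ** 3 / abs(d2y_m))

# y = C*exp(-k*x**2): x_m = 0, y(x_m) = C, y''(x_m) = -2*C*k
C, k = 3.0, 0.5
gauss_approx = peak_estimate(C, -2.0 * C * k)
gauss_exact = C * math.sqrt(math.pi / k)
print(gauss_approx, gauss_exact)
```

The two printed numbers agree to rounding error, as they must, since the coefficient was chosen to make (22) exact for this function.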
Thus, for the two types of rapidly varying functions we obtained
two formulas, (19) and (22), and the coefficients in these formulas
are chosen so that the formulas are exact for integrals of typical
functions over an infinite interval: (19) is exact for
$\int_a^\infty Ce^{-kx}\,dx$ and (22) is exact for
$\int_{-\infty}^{+\infty} Ce^{-kx^2}\,dx$. For a different choice of
typical functions the formulas would contain different coefficients.
For instance, let us determine the value of the coefficient l from the
condition that formula (20) be exact for
$I = \int_{-\infty}^{+\infty} \frac{C}{1 + kx^2}\,dx$, C > 0, k > 0. (The
reader will readily see that the function $y = \frac{C}{1 + kx^2}$ has a
maximum at $x_m = 0$.) Since $y(x_m) = C$ and $y''(x_m) = -2Ck$, (20) gives

$$I = l\,\sqrt{\frac{C^3}{2Ck}} = \frac{lC}{\sqrt{2k}}$$

To compute the integral exactly, set $\sqrt{k}\,x = z$,
$dz = \sqrt{k}\,dx$; then

$$I = \frac{C}{\sqrt{k}}\int_{-\infty}^{+\infty} \frac{dz}{1 + z^2} = \frac{C\pi}{\sqrt{k}}$$

Therefore $\frac{lC}{\sqrt{2k}} = \frac{C\pi}{\sqrt{k}}$, whence
$l = \pi\sqrt{2}$. In this case we get the formula

$$I \approx \pi\,\sqrt{\frac{2\,y^3(x_m)}{|y''(x_m)|}}$$

However, preference is given precisely to formulas (19) and (22).
The reason is this. It turns out that if a rapidly varying function is
obtained by raising to the power n some given function f(x) for which
$|f(x)| < f(x_m)$ when $x \ne x_m$, then for sufficiently large n the
relative error of (19) and (22) becomes arbitrarily small.

We  illustrate  with  a couple  of  examples. 

1. We will integrate successive powers of the function
$u = \frac{1}{1 + x}$ from x = 0 to $x = \infty$. Set
$u_n = u^n = \frac{1}{(1 + x)^n}$. These are rapidly varying functions of
the first type with maximum at an endpoint, and the greater the exponent
n, the steeper the function $u_n$ falls as x increases from 0. We find
$I^{(n)} = \int_0^\infty \frac{dx}{(1 + x)^n}$ with the aid of (19). Since
$u_n' = -\frac{n}{(1 + x)^{n+1}}$, it follows that $|u_n'(0)| = n$ and so
the approximate value of this integral is

$$I^{(n)}_{ap} = \frac{1}{n}$$

The exact value of the integral is

$$I^{(n)}_{ex} = \int_0^\infty \frac{dx}{(1 + x)^n} = \left.\frac{1}{-n + 1}\,\frac{1}{(1 + x)^{n-1}}\right|_0^\infty = \frac{1}{n - 1}$$

The ratio

$$\frac{I^{(n)}_{ap}}{I^{(n)}_{ex}} = \frac{n - 1}{n}$$

Since the fraction $\frac{n - 1}{n}$ gets closer to unity as n increases,
we have

$$\frac{I^{(n)}_{ap}}{I^{(n)}_{ex}} \approx 1$$

for very large values of n, which is what we asserted. If in (18) we
chose $m \ne 1$, the value of m would have remained in the right-hand
member of the last formula, that is, for large n we would have a
systematic error.

2. Suppose $z = \frac{1}{1 + x^2}$ with $-\infty < x < \infty$. We form the
rapidly varying functions $z_n = z^n = \frac{1}{(1 + x^2)^n}$. These are
functions of the second kind with maximum inside the region (here,
$x_m = 0$). We approximate
$I^{(n)} = \int_{-\infty}^{+\infty} \frac{dx}{(1 + x^2)^n}$ using (22) and
get $I^{(n)}_{ap} = \sqrt{\frac{\pi}{n}}$.

The integral $I^{(n)}$ can be computed exactly:

$$I^{(n)}_{ex} = \pi\,\frac{1 \cdot 3 \cdot 5 \cdots (2n - 3)}{2 \cdot 4 \cdot 6 \cdots (2n - 2)}$$

Prove this by integrating the equation

$$\left[\frac{x}{(1 + x^2)^{n-1}}\right]' = -\frac{2n - 3}{(1 + x^2)^{n-1}} + \frac{2n - 2}{(1 + x^2)^n}$$


The ratio

$$\frac{I^{(n)}_{ap}}{I^{(n)}_{ex}} = \frac{1}{\sqrt{\pi n}}\,\frac{2 \cdot 4 \cdot 6 \cdots (2n - 2)}{1 \cdot 3 \cdot 5 \cdots (2n - 3)} \tag{23}$$

Let us compute the values of the right member of (23) for several
values of n. For n = 2 we get 0.797. For n = 4 we have 0.904; for
n = 6 we have 0.938; for n = 8 we get 0.962; and finally for n = 10
we obtain 0.978. Thus, as n increases, the value of the right member
of (23) approaches 1*.
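The right member of (23) can also be evaluated directly by machine. The sketch below is an illustration only: the function name `ratio_23` is made up, and the exact integral is built up from the recurrence $I^{(n)} = \frac{2n-3}{2n-2}\,I^{(n-1)}$, $I^{(1)} = \pi$, which follows from integrating the identity given above. The printed values may differ from the rounded figures quoted in the text in the second or third decimal place.

```python
import math

def ratio_23(n):
    """I_ap / I_ex for the integral of (1 + x**2)**(-n) over the whole axis, n >= 2."""
    approx = math.sqrt(math.pi / n)      # formula (22) with y(0) = 1, y''(0) = -2n
    exact = math.pi                      # I(1) = pi
    for j in range(2, n + 1):            # I(j) = (2j - 3)/(2j - 2) * I(j - 1)
        exact *= (2 * j - 3) / (2 * j - 2)
    return approx / exact

for n in (2, 4, 6, 8, 10, 100):
    print(n, round(ratio_23(n), 4))
```

The ratio grows monotonically towards 1 as n increases, which is the behaviour asserted in the text.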

We now pass to the general case. Let us consider $y = [f(x)]^n$ where
the function f(x) > 0 has a unique maximum point. Let
$\ln f(x) = \varphi(x)$, then $f(x) = e^{\varphi(x)}$. Therefore

$$y = e^{n\varphi(x)} \tag{24}$$

The integral $I = \int_a^b y\,dx$ is mainly determined by the values of
the integrand in the region where y does not differ substantially from
its maximum value.

It is clear that y attains a maximum at the same time as $\varphi(x)$,
that is, for one and the same value $x = x_m$. For y to decrease e times
as compared with $y_{\max}$, it is necessary that $n\varphi(x)$ be less
than $n\varphi(x_m)$ by unity, that is, that
$n\varphi(x) = n\varphi(x_m) - 1$, whence

$$\varphi(x) = \varphi(x_m) - \frac{1}{n}$$

The greater n, the less $\varphi(x)$ differs from $\varphi(x_m)$, and so
the greater n, the more exact is the replacement of $\varphi(x)$ by the
first two nonzero terms in the Taylor expansion. Writing down the two
terms of the expansion, we get
$\varphi(x) = \varphi(x_m) + (x - x_m)\,\varphi'(x_m)$ (for a function
having a maximum at an endpoint) or
$\varphi(x) = \varphi(x_m) + \frac{1}{2}(x - x_m)^2\,\varphi''(x_m)$
(for a function with a maximum inside the interval).

In the former case, using (24) we get

$$y = e^{n\varphi(x_m) + n(x - x_m)\varphi'(x_m)} = Ae^{-b(x - x_m)}$$

where $A = e^{n\varphi(x_m)}$, $b = -n\varphi'(x_m) = n|\varphi'(x_m)|$.

In the latter case we obtain

$$y = e^{n\varphi(x_m) + \frac{1}{2}n(x - x_m)^2\varphi''(x_m)} = Ae^{-c(x - x_m)^2}$$

where $c = -\frac{1}{2}n\varphi''(x_m) = \frac{1}{2}n|\varphi''(x_m)|$.


* This can be proved in a more rigorous fashion with the aid of Stirling's
formula (see the exercise in Sec. 3.3).


Thus, the function y(x) may be approximately replaced with either
a function for which (19) is absolutely exact or a function for which
(22) is absolutely exact, the replacement being the more exact, the
greater n is.

For the sake of simplicity we assume that $x_m = 0$. Then in order
to construct a rapidly varying function we could go over to f(nx)
instead of $[f(x)]^n$. But in that case, both members of (19), or,
respectively, (22), are merely divided by n, that is to say, the
relative error of both formulas does not change. This is due to the
fact that in the indicated transition the relative portion of the
contribution to the integral of large and small values of the function
f(x) remains unchanged (whereas when we pass from f(x) to $[f(x)]^n$,
the portion of small values tends to zero with increasing n).

In  conclusion,  we  recommend  Approximate  Methods  in  Quantum 
Mechanics  by  Migdal  and  Krainov  [11],  which  contains  a variety  of 
methods  for  estimating  integrals  and  other  mathematical  expressions. 


Exercises

1. Find the integral $I = \int_0^\infty \frac{x\,dx}{1 + \dots}$ by
splitting it into a sum of two integrals:
$I = \int_0^3 \frac{x\,dx}{1 + \dots} + \int_3^\infty \frac{x\,dx}{1 + \dots}$.
Evaluate the first integral by Simpson's rule, the second integral by the
formulas of this section. Carry the computations to within slide-rule
accuracy.

2. Evaluate the integral $I = \int_0^\infty xe^{-x^2}\,dx$ by splitting it
into a sum of two integrals:
$I = \int_0^a xe^{-x^2}\,dx + \int_a^\infty xe^{-x^2}\,dx$. Find the first
integral using the formulas of Sec. 1.1, the second using the formulas
of Sec. 3.2. Consider the cases a = 1 and a = 2. Carry the
calculations to three decimal places.

3. Estimate the integral $\int_N^\infty e^{-x^2}\,dx$ considered on p. 69.


3.3  Stirling’s  formula 

As an interesting example in the use of the formulas of Sec. 3.2
we obtain a convenient formula for approximating
$n! = n(n - 1)(n - 2) \cdots 3 \cdot 2 \cdot 1$ for large n. This formula
will come in handy in the study of probability theory. By means of
integration by parts it is easy to establish (see, for example, HM,
Sec. 4.3) that the following equation holds true for n a positive integer:

$$\int_0^\infty x^ne^{-x}\,dx = n!$$

Let us estimate this integral by the method of the preceding
section.

In our case $y = x^ne^{-x}$, $y' = (nx^{n-1} - x^n)e^{-x}$. Equating the
first derivative to zero, we get two values: x = 0 and x = n. It is easy
to see that the function y(x) has a maximum at x = n and is zero at
x = 0. (Fig. 18 depicts the graphs of $y = x^ne^{-x}$ for n = 3 and
n = 4.) Thus, to evaluate the integral we have to consider the
region of the maximum of the function, x = n, which means using
formula (22).

Fig. 18

To find y'', we get
$y'' = [n(n - 1)x^{n-2} - 2nx^{n-1} + x^n]e^{-x}$ and so
$y''(n) = -n^{n-1}e^{-n} = -\frac{1}{n}\left(\frac{n}{e}\right)^n$.
Therefore, by (22),

$$\int_0^\infty x^ne^{-x}\,dx \approx \sqrt{\frac{2\pi\left(\frac{n}{e}\right)^{3n}}{\frac{1}{n}\left(\frac{n}{e}\right)^n}} = \sqrt{2\pi n}\left(\frac{n}{e}\right)^n$$

Thus,

$$n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^n$$

This is Stirling's formula. Its relative error tends to zero with
increasing n [this follows from the reasoning of Sec. 3.2 if we represent
$x^ne^{-x}$ as $\left(xe^{-x/n}\right)^n$]. But even for small n it yields
very good results. For example,

n = 1,  n! = 1,    $\sqrt{2\pi}\left(\frac{1}{e}\right)^1 = 0.92$,   error 8%;
n = 2,  n! = 2,    $\sqrt{4\pi}\left(\frac{2}{e}\right)^2 = 1.92$,   error 4%;
n = 3,  n! = 6,    $\sqrt{6\pi}\left(\frac{3}{e}\right)^3 = 5.84$,   error 2.7%;
n = 4,  n! = 24,   $\sqrt{8\pi}\left(\frac{4}{e}\right)^4 = 23.5$,   error 2.1%;
n = 5,  n! = 120,  $\sqrt{10\pi}\left(\frac{5}{e}\right)^5 = 118$,   error 1.7%
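The small table above is easy to regenerate. The following Python sketch is illustrative only (the function name `stirling` is an assumption); it prints n, the exact factorial, Stirling's approximation and the relative error.

```python
import math

def stirling(n):
    """Stirling's approximation sqrt(2*pi*n) * (n/e)**n."""
    return math.sqrt(2.0 * math.pi * n) * (n / math.e) ** n

for n in range(1, 6):
    exact = math.factorial(n)
    approx = stirling(n)
    error_percent = 100.0 * (exact - approx) / exact
    print(n, exact, round(approx, 2), f"{error_percent:.1f}%")
```

Running the loop for larger n shows the relative error continuing to fall, roughly in proportion to 1/n.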


Exercise 


Prove that the ratio (23) tends to 1 as $n \to \infty$.

Hint.  Multiply  the  numerator  and  the  denominator  of  the  fraction 
by  the  numerator. 


3.4  Integrating  rapidly  oscillating  functions 

In the study of rapidly oscillating actions on physical systems one
has to consider integrals of rapidly oscillating functions, that is,
functions which change sign many times over a finite interval of
integration. Such integrals have peculiarities.

Suppose we have to compute the integral

$$I = \int_a^b F(x)\,dx \tag{25}$$

where the graph of F(x) has the form shown in Fig. 19. At first we
assume $\omega$ (frequency) to be constant and the amplitude to be varying
in accordance with the law y = f(x); in other words, we assume
the integral (25) to have the form

$$I = \int_a^b f(x)\sin(\omega x + \alpha)\,dx \tag{26}$$

where $\omega$ is great.

Fig. 19

From Fig. 19 it is evident that (26) is small for large $\omega$ since
the positive part is almost neutralized by the negative part. To
obtain a more accurate estimate, perform an integration by parts
to get

$$I = \frac{1}{\omega}\left[f(a)\cos(\omega a + \alpha) - f(b)\cos(\omega b + \alpha)\right] + \frac{1}{\omega}\int_a^b f'(x)\cos(\omega x + \alpha)\,dx \tag{27}$$

This integral has the form of (26) and so the entire last term, for large
$\omega$, is of higher order of smallness than $1/\omega$. Dropping this
term, we get the approximate formula

$$I \approx \frac{1}{\omega}\left[f(a)\cos(\omega a + \alpha) - f(b)\cos(\omega b + \alpha)\right] \tag{28}$$

This result can also be expressed in terms of the integrand F(x) of the
original integral (25): since

$$F'(x) = f'(x)\sin(\omega x + \alpha) + \omega f(x)\cos(\omega x + \alpha) \approx \omega f(x)\cos(\omega x + \alpha)$$

we can also write

$$I \approx \frac{1}{\omega^2}\left[F'(a) - F'(b)\right] = \left.-\frac{1}{\omega^2}\,F'(x)\right|_a^b \tag{29}$$

For more precision, we can integrate by parts once more in the right
member of (27); and after dropping the resulting integral we arrive
at the approximate formula

$$I \approx \frac{1}{\omega}\left[f(a)\cos(\omega a + \alpha) - f(b)\cos(\omega b + \alpha)\right] + \frac{1}{\omega^2}\left[f'(b)\sin(\omega b + \alpha) - f'(a)\sin(\omega a + \alpha)\right] \tag{30}$$

Here too we can write a formula similar to (29). To do this, from
expressions for F'(x) and F'''(x) compute $f(x)\cos(\omega x + \alpha)$ and
$\frac{1}{\omega}f'(x)\sin(\omega x + \alpha)$ to within terms of the order
of $1/\omega^2$ and substitute the result into (30). We leave the
manipulations to the reader and write down the final formula:

$$I \approx -\frac{1}{\omega^2}\left[2F'(x) + \frac{1}{\omega^2}F'''(x)\right]_a^b$$

Like (30), this formula is true to within terms of the order of
$1/\omega^3$.

Using the foregoing procedure it is possible to refine the asymptotic
formulas for I, but the resulting formulas will be progressively more
unwieldy and inconvenient for practical use. In practice, the most
frequently used formulas are (28) and (29). It is interesting to note
that in all these formulas the values of the functions f or F and their
derivatives participate only at the endpoints of the interval of
integration. Incidentally, this is in accord with the approximate
formulas for alternating sums obtained in Sec. 1.2.

Note that in the foregoing formulas we had to assume that the
function f(x) and its derivatives of the orders under consideration are
continuous within the interval of integration. If f(x) has a finite jump
there at x = c, then we have to pass to the sum of integrals from a to
c and from c to b, and only then apply the indicated transformations
to these integrals, as a result of which the point x = c will make its
contribution to the asymptotic formulas. This contribution is
particularly substantial in the important case where $a = -\infty$,
$b = \infty$ and f(x) together with all its derivatives vanishes at
$x = \pm\infty$, for then the right sides of (28) and (30) are zero. This
question will be dealt with in more detail in Sec. 14.4.

As an example we consider the integral
$I = \int_0^b e^{-x}\sin\omega x\,dx$. Its exact value is

$$I_{ex} = \frac{\omega}{1 + \omega^2} - \frac{e^{-b}}{1 + \omega^2}\,(\sin\omega b + \omega\cos\omega b)$$

Formula (28) yields the approximate value

$$I_{(28)} = \frac{1}{\omega}\,(1 - e^{-b}\cos\omega b)$$

and (29) the value

$$I_{(29)} = \frac{1}{\omega^2}\,[\omega - e^{-b}(\omega\cos\omega b - \sin\omega b)]$$

Both approximate formulas have an error of the order of $1/\omega^2$.
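The orders of the errors can be checked numerically for this example. The Python sketch below is illustrative; the function names are made up, and the values $\omega = 50$, b = 1 are arbitrary sample choices. Formula (30) is written out for $f(x) = e^{-x}$, $\alpha = 0$, a = 0.

```python
import math

def exact_value(omega, b):
    """Closed form of the integral of exp(-x)*sin(omega*x) from 0 to b."""
    return (omega - math.exp(-b) * (math.sin(omega * b) + omega * math.cos(omega * b))) / (1 + omega ** 2)

def formula_28(omega, b):
    return (1 - math.exp(-b) * math.cos(omega * b)) / omega

def formula_29(omega, b):
    return (omega - math.exp(-b) * (omega * math.cos(omega * b) - math.sin(omega * b))) / omega ** 2

def formula_30(omega, b):
    # f'(x) = -exp(-x); the term at the lower limit vanishes because sin(0) = 0
    return formula_28(omega, b) - math.exp(-b) * math.sin(omega * b) / omega ** 2

omega, b = 50.0, 1.0
reference = exact_value(omega, b)
for name, value in (("(28)", formula_28(omega, b)),
                    ("(29)", formula_29(omega, b)),
                    ("(30)", formula_30(omega, b))):
    print(name, value, abs(value - reference))
```

For these sample values the errors of (28) and (29) are of the size of $1/\omega^2$, while the error of (30) is smaller by roughly another factor of $\omega$.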
Fig. 20


Now consider a case where the rapidly oscillating function F(x)
under the sign of the integral (25) has a graph as shown in Fig. 20;
that is, it varies in both amplitude and frequency. In this case the
integral (25) can often be written in the form

$$I = \int_a^b f(x)\sin(\omega\varphi(x) + \alpha)\,dx \tag{31}$$

where $\varphi(x)$ is an increasing (but not at a constant rate!) function.

The integral (31) can be reduced to (26) by means of a change of
variable: $\varphi(x) = s$. If we denote the inverse function by x = g(s),
we get

$$I = \int_{\varphi(a)}^{\varphi(b)} f(g(s))\,g'(s)\sin(\omega s + \alpha)\,ds$$

Using the approximate formula (28) gives

$$I \approx -\frac{1}{\omega}\Big[f(g(s))\,g'(s)\cos(\omega s + \alpha)\Big]_{s=\varphi(a)}^{\varphi(b)} = -\frac{1}{\omega}\left[\frac{f(x)}{\varphi'(x)}\cos(\omega\varphi(x) + \alpha)\right]_a^b \tag{32}$$

(here we made use of the formula $g'(s) = 1/\varphi'(x)$ for the derivative
of the inverse function). If $\varphi(x) = x$, then (32) becomes (28). The
modified formula (29), which we suggest that the reader derive, becomes

$$I \approx \left.-\frac{F'(x)}{\omega^2[\varphi'(x)]^2}\right|_a^b$$
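Formula (32) can be compared with a direct numerical integration. The Python sketch below is an illustration under assumed sample data: f(x) = 1, $\varphi(x) = x^2$, $\alpha = 0$, the interval [1, 2] and $\omega = 40$ are arbitrary choices, and the composite Simpson rule is used only as a reference value.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def formula_32(f, phi, dphi, omega, alpha, a, b):
    """Leading-order estimate (32), built from endpoint values only."""
    def term(x):
        return f(x) / dphi(x) * math.cos(omega * phi(x) + alpha)
    return -(term(b) - term(a)) / omega

omega, alpha, a, b = 40.0, 0.0, 1.0, 2.0
f = lambda x: 1.0            # amplitude (sample choice)
phi = lambda x: x * x        # increasing phase function (sample choice)
dphi = lambda x: 2.0 * x     # its derivative

approx = formula_32(f, phi, dphi, omega, alpha, a, b)
reference = simpson(lambda x: f(x) * math.sin(omega * phi(x) + alpha), a, b)
print(approx, reference, abs(approx - reference))
```

With these sample data the discrepancy is of the order of $1/\omega^2$, in line with the accuracy of (28).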

Exercises

1. Apply (28), (29) and (30) to the integral
$\int \frac{\cos\omega x}{1 + x^2}\,dx$.

2. Write an analogue of formula (30) for the integral (31).

3. Evaluate the integral $I = \int_1^2 \frac{\sin\omega x}{x}\,dx$ at
$\omega = 1$ and $\omega = 10$, using the approximate formulas (28), (29)
and (30). Compare the results with the exact values taken from tables of
the sine integral.

3.5  Numerical  series 

A numerical series is an "infinite sum" of numbers*

$$a_1 + a_2 + a_3 + \dots + a_n + a_{n+1} + \dots \tag{33}$$

This is not a real sum, of course, since only a finite number of numbers
can be added. It is a sum with a singularity, similar to the singularities
of improper integrals considered in Sec. 3.1. For this reason, the
approach to the concept of the sum of the series (33) is similar to that
applied in Sec. 3.1; namely, first the singularity is cut off, as it were,
and one considers the partial sums of the series (33):

$$S_1 = a_1,\quad S_2 = a_1 + a_2,\quad S_3 = a_1 + a_2 + a_3,\ \dots,\quad S_n = a_1 + a_2 + a_3 + \dots + a_n \tag{34}$$

Now  if  we  increase  n so  as  to  “exhaust  the  singularity"  and  watch 
the  behaviour  of  the  partial  sum  (34),  two  possible  cases  emerge. 

(1)  The  partial  sum  may  approach  a definite  finite  limit  S so  that 
for  large  n it  is  practically  equal  to  S.  In  this  case  (33)  is  said  to  be 
a convergent  series , and  the  sum  of  the  series  is  set  equal  to  S.  Thus,  in 
the  case  of  convergence  we  can  go  from  a partial  sum  with  a large  n 
to  the  complete  sum  of  the  series  and  conversely,  that  is,  the  contri- 
bution of  the  singularity  to  the  total  sum  of  the  series  is  not  essential 
and  is  arbitrarily  small  for  large  n. 

(2)  The  partial  sum  may  tend  to  infinity  or  may  oscillate  without 
having  a definite  limit.  In  this  case,  (33)  is  said  to  be  a divergent 
series. 

The simplest example of a convergent series is given by the sum
of an infinite decreasing geometric progression:

$$a + aq + aq^2 + \dots + aq^{n-1} + aq^n + \dots \quad (|q| < 1) \tag{35}$$

Here, as we know,

$$S_n = a + aq + \dots + aq^{n-1} = \frac{a - aq^n}{1 - q}$$

and since the power $q^n$ can be neglected for large n, the sum of the
series (35) is, in the limit,

$$S = \lim_{n\to\infty} S_n = \frac{a}{1 - q}$$
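How quickly the partial sums of a geometric progression settle down can be seen from a few lines of Python. This is only a sketch; the function name and the sample values a = 2, q = 0.5 are assumptions made for the example.

```python
def geometric_partial_sum(a, q, n):
    """S_n = a + a*q + ... + a*q**(n - 1), summed term by term."""
    return sum(a * q ** k for k in range(n))

a, q = 2.0, 0.5
limit = a / (1 - q)                       # sum of the infinite progression
for n in (1, 2, 5, 10, 20, 50):
    s_n = geometric_partial_sum(a, q, n)
    closed_form = a * (1 - q ** n) / (1 - q)
    print(n, s_n, closed_form, limit - s_n)
```

The last column is the neglected remainder $aq^n/(1 - q)$, which shrinks geometrically with n.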


* An elementary exposition of series is given in HM, Sec. 3.17. Here we
continue the discussion, which makes use of the results of Sec. 1.2.
The series 1 + 1 + 1 + ... is an instance of a series diverging to
infinity, and the series 1 - 1 + 1 - 1 + 1 - 1 + ... is an instance of
a series diverging in oscillatory fashion, since its partial sums are equal
successively to 1, 0, 1, 0, 1, ... and do not have a definite limit.

Since in the case of a convergent series the partial sums with
large n are nearly the same, terms with large n are nearly equal to
zero: to be more precise, if the series (33) converges, then its "general
term" $a_n$ tends to zero as n increases. However, in the case of a
divergent series the general term may tend to zero too. This is so, for
example, for the divergent series

$$1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \dots + \frac{1}{\sqrt{n}} + \dots$$

(The divergence of this series can be demonstrated thus:

$$S_n = 1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \dots + \frac{1}{\sqrt{n}} > \frac{1}{\sqrt{n}} + \frac{1}{\sqrt{n}} + \dots + \frac{1}{\sqrt{n}} = \frac{n}{\sqrt{n}} = \sqrt{n} \to \infty \quad (n \to \infty).)$$

Hence,  this  test  alone  is  not  enough  to  establish  the  convergence 
of  a series.  Nevertheless,  if  the  dependence  of  the  general  term  an 
on  n is  known  and  is  not  very  complicated,  then  the  convergence 
or  divergence  of  (33)  can  ordinarily  be  decided  on  the  basis  of  other 
tests  which  we  give  below.  But  if  it  is  not  possible  to  establish  such 
a simple  relationship,  one  merely  computes  the  terms  in  succession 
and  if  they  do  not  go  beyond  the  limits  of  the  accepted  accuracy  and 
there  are  no  grounds  to  expect  subsequent  terms  to  make  substantial 
contributions,  then  all  subsequent  terms  are  dropped,  the  series  is 
stated  to  be  convergent  and  its  sum  is  regarded  as  equal  to  the  partial 
sum  of  the  computed  terms. 

The first test for the convergence of series (33) is what is known
as d'Alembert's ratio test, which is based on an analogy with the sum
of the infinite geometric progression (35). For the "pure" progression
(35), the ratio of every term to the preceding one is a constant (it is
equal to the common ratio q of the progression). Now suppose that for
(33) the ratio

$$\frac{a_{n+1}}{a_n}$$

of a term to the preceding term is no longer a constant but tends to
some limit q as n increases. Then for large n this ratio is approximately
equal to q and (33), like (35), converges if
$|q| = \lim_{n\to\infty}\left|\frac{a_{n+1}}{a_n}\right| < 1$; (33)
diverges if |q| > 1. And only when |q| = 1 is it impossible to establish,
via the ratio test, whether (33) converges or not, so other tests
have to be invoked.

Consider the series

$$\frac{a}{1^p} + \frac{a^2}{2^p} + \frac{a^3}{3^p} + \dots + \frac{a^n}{n^p} + \dots \tag{36}$$

Applying the ratio test, we get

$$\lim_{n\to\infty}\frac{a_{n+1}}{a_n} = \lim_{n\to\infty}\left[\frac{a^{n+1}}{(n+1)^p} : \frac{a^n}{n^p}\right] = \lim_{n\to\infty}\frac{a}{\left(1 + \frac{1}{n}\right)^p} = a$$

Thus, for |a| < 1 the series (36) converges like a geometric progression,
for |a| > 1 it diverges; when $a = \pm 1$, the ratio test cannot be
applied to (36).

Let us now consider a stronger test called Cauchy's integral test
for convergence that is applicable to a series with positive terms.
Suppose we know the expression of the general term $a_n$ of (33) as
a function of n, that is, $a_n = f(n)$, and the function f(n) is positive
and decreasing with increasing n. Then, by virtue of formula (9)
of Ch. 1, we can take

$$S_n = a_1 + a_2 + \dots + a_n = f(1) + f(2) + \dots + f(n) \approx \int_1^n f(x)\,dx + \frac{1}{2}f(1) + \frac{1}{2}f(n) \tag{37}$$

for the approximate value of the partial sum (34). Thus, if the integral

$$\int_1^\infty f(x)\,dx \tag{38}$$

converges, then the right side of (37) remains finite as $n \to \infty$,
that is, the series (33) converges. But if the integral (38) diverges to
infinity, then the series (33) diverges.

We consider, for example, the series

$$\frac{1}{1^p} + \frac{1}{2^p} + \frac{1}{3^p} + \dots + \frac{1}{n^p} + \dots \tag{39}$$

which is a consequence of (36) for a = 1, when d'Alembert's ratio
test fails. To apply Cauchy's integral test we have to consider the
integral

$$\int_1^\infty \frac{dx}{x^p}$$

This integral was considered in Sec. 3.1 (formula (3)), where we found
that it converges for p > 1 and diverges for $p \le 1$. Hence, the series
(39) too converges for p > 1 and diverges for $p \le 1$. In particular,
when p = 1 we get the so-called harmonic series

$$1 + \frac{1}{2} + \frac{1}{3} + \dots$$
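The behaviour predicted by the integral test can be observed numerically. The Python sketch below is illustrative: the function names are made up, and the right side of (37) is taken in the form "integral plus half the first term plus half the last term" for $f(x) = x^{-p}$.

```python
import math

def partial_sum(p, n):
    """Partial sum 1/1**p + 1/2**p + ... + 1/n**p of the series (39)."""
    return sum(1.0 / k ** p for k in range(1, n + 1))

def estimate_37(p, n):
    """Right side of (37) for f(x) = x**(-p)."""
    integral = math.log(n) if p == 1 else (n ** (1 - p) - 1) / (1 - p)
    return integral + 0.5 + 0.5 / n ** p

for p in (2, 1):
    for n in (10, 100, 1000, 10000):
        print(p, n, round(partial_sum(p, n), 4), round(estimate_37(p, n), 4))
```

For p = 2 both columns level off as n grows; for p = 1 both keep increasing like ln n, the difference between them staying bounded.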
In the case of series with terms of arbitrary sign, frequent use is
made of the Leibniz test, according to which the series

$$a_1 - a_2 + a_3 - a_4 + a_5 - a_6 + \dots \tag{40}$$

(all the $a_i$ are considered to be positive so that two adjacent terms
are of opposite sign) converges if

$$a_1 > a_2 > a_3 > \dots > a_n > \dots \to 0 \tag{41}$$

Indeed, if (Fig. 21) we depict on an auxiliary axis the partial sums
(34), then from the condition (41) it follows that the transition from
$S_1$ to $S_2$ and from $S_2$ to $S_3$, from $S_3$ to $S_4$ and so forth
is in the form of damped oscillations, that is, the partial sums tend to
a definite limit.

Thus, for example, the series

$$1 - \frac{1}{2^p} + \frac{1}{3^p} - \frac{1}{4^p} + \dots \tag{42}$$

converges for any p > 0.
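The damped oscillation of the partial sums is easy to watch on a computer. The Python sketch below is an illustration (the function name is made up); it uses p = 1, for which the sum of the alternating series is known to be ln 2.

```python
import math

def alternating_partial_sums(p, n):
    """Partial sums S_1, ..., S_n of 1 - 1/2**p + 1/3**p - 1/4**p + ..."""
    sums, s = [], 0.0
    for k in range(1, n + 1):
        s += (-1) ** (k + 1) / k ** p
        sums.append(s)
    return sums

sums = alternating_partial_sums(1, 1000)
print(sums[:6])                              # oscillates around the limit
print(sums[-2], math.log(2), sums[-1])       # S_999 > ln 2 > S_1000
```

Successive partial sums lie alternately above and below the limit, and the gap between neighbours equals the term just added.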

If,  as  in  the  preceding  test,  the  terms  of  the  series  (40)  have  the 
form  f(n),  then  an  approximate  value  of  the  sum  can  be  computed  by 
the  methods  of  Sec.  1.2.  Procedures  for  refining  that  value  will  be  dis- 
cussed later  on. 

Convergent  series  with  terms  of  arbitrary  sign  (not  necessarily 
alternating  as  in  (40))  are  of  two  types. 

(1) It may turn out that both the "positive part" of the original
series (that is, a series made up solely of the positive terms of the
original series) and the "negative part" converge. Then the original
series is termed an absolutely convergent series, since a series made up
of the absolute values of its terms also converges.

(2)  It  may  turn  out  that  both  the  positive  and  the  negative 
parts  of  the  original  series  diverge  to  infinity,  but  the  series  itself 
converges  due  to  a compensation  of  these  infinities.  Such  a series  is 
called  conditionally  convergent  since  a series  composed  of  the  absolute 
values  of  its  terms  diverges. 

For example, recalling the series (39), we conclude that the series
(42) is absolutely convergent for p > 1 and conditionally convergent
for $0 < p \le 1$.

We  can  operate  with  convergent  series  just  as  we  do  with  finite 
sums,  since,  practically  speaking,  the  sum  of  a convergent  series  is 
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simply  equal  to  a partial  sum  having  a sufficiently  large  number  of 
terms.  A complication  arises,  though  it  is  not  so  obvious,  when  we 
rearrange  the  terms  of  a convergent  series.  Such  a rearrangement 
does  not  affect  the  sum  of  an  absolutely  convergent  series,  but  a con- 
ditionally convergent  series  may,  after  such  a rearrangement,  alter  its 
sum  or  even  become  divergent  because  such  a rearrangement  may 
alter  or  even  upset  the  "compensation  of  infinities”  mentioned  above. 
Consider,  for  instance,  the  convergent  series 

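$$1 - \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} - \frac{1}{\sqrt{4}} + \frac{1}{\sqrt{5}} - \frac{1}{\sqrt{6}} + \dots \tag{43}$$

and regroup the terms so that one negative term follows two positive ones:

$$1 + \frac{1}{\sqrt{3}} - \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{5}} + \frac{1}{\sqrt{7}} - \frac{1}{\sqrt{4}} + \frac{1}{\sqrt{9}} + \frac{1}{\sqrt{11}} - \frac{1}{\sqrt{6}} + \dots \tag{44}$$

The partial sum $S_{3n}$ of this series (the number of the term is $3n$) consists of a group of positive terms,

$$1 + \frac{1}{\sqrt{3}} + \frac{1}{\sqrt{5}} + \frac{1}{\sqrt{7}} + \dots + \frac{1}{\sqrt{4n-1}}$$

and a group of negative terms,

$$-\frac{1}{\sqrt{2}} - \frac{1}{\sqrt{4}} - \dots - \frac{1}{\sqrt{2n}}$$

But the first sum exceeds

$$\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{4}} + \frac{1}{\sqrt{6}} + \frac{1}{\sqrt{8}} + \dots + \frac{1}{\sqrt{4n-2}} + \frac{1}{\sqrt{4n}}$$

and so the general sum

$$S_{3n} > \frac{1}{\sqrt{2n+2}} + \frac{1}{\sqrt{2n+4}} + \dots + \frac{1}{\sqrt{4n-2}} + \frac{1}{\sqrt{4n}} = \frac{1}{\sqrt{2}}\left(\frac{1}{\sqrt{n+1}} + \frac{1}{\sqrt{n+2}} + \dots + \frac{1}{\sqrt{2n-1}} + \frac{1}{\sqrt{2n}}\right)$$

By virtue of (11), Ch. 1, the right side is approximately equal to

$$\frac{1}{\sqrt{2}}\int_{n+1/2}^{2n+1/2} \frac{dx}{\sqrt{x}} = \sqrt{2}\left(\sqrt{2n+\tfrac{1}{2}} - \sqrt{n+\tfrac{1}{2}}\right) = \sqrt{2n}\left(\sqrt{2+\frac{1}{2n}} - \sqrt{1+\frac{1}{2n}}\right) \xrightarrow[n\to\infty]{} \infty$$

To summarize, then, the two series (43) and (44) differ solely in the order of their terms but the former converges and the latter diverges to infinity.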
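The contrast between (43) and (44) is easy to observe numerically. The following is only an illustrative sketch in Python (the function names are ours and do not come from the text): it adds up the alternating series in its natural order and in the rearranged order, and prints the large-n form $\sqrt{2n}(\sqrt{2} - 1)$ of the estimate obtained above for comparison.

```python
import math

def alternating_partial_sum(n_terms):
    """Partial sum of series (43): 1 - 1/sqrt(2) + 1/sqrt(3) - ..."""
    return sum((-1) ** (k + 1) / math.sqrt(k) for k in range(1, n_terms + 1))

def rearranged_partial_sum(n_groups):
    """Partial sum S_{3n} of series (44): two positive terms, then one negative."""
    total = 0.0
    for j in range(1, n_groups + 1):
        total += 1 / math.sqrt(4 * j - 3) + 1 / math.sqrt(4 * j - 1)  # two odd terms
        total -= 1 / math.sqrt(2 * j)                                  # one even term
    return total

for n in (10, 1000, 100000):
    estimate = math.sqrt(2 * n) * (math.sqrt(2) - 1)  # large-n form of the bound
    print(n, alternating_partial_sum(3 * n), rearranged_partial_sum(n), estimate)
```

The second column settles near a fixed value, while the third grows roughly like the square root of n, in agreement with the estimate in the last column.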

If,  as  is  often  done  in  practice,  we  want  to  replace  the  sum  of  a 
series  by  the  partial  sum  of  a few  terms,  then  the  series  must  not  merely 
converge  but  converge  rapidly  so  that  only  a small  number  of  terms 
almost  exhausts  the  total  sum  to  the  accuracy  we  want.  For  slowly 
converging  series  (ordinarily,  these  are  conditionally  convergent  series) 
one  cannot  drop  the  remainder  of  the  series  but  must  estimate  it  by 
using  the  methods  of  Sec.  1.2. 

It may be essential, both for convergent and divergent series, to find the asymptotic law of variation of the partial sum as n increases. This can be done with the aid of the methods of Sec. 1.2, that is, using formulas (9), (11), (14), or (15) of Ch. 1, although this results in a definite error. For the sake of simplicity, we confine ourselves to series of positive terms. Ordinarily, the partial sum of a series is of the form

$$f(a) + f(a + h) + f(a + 2h) + \dots + f(a + nh)$$

where the number of terms increases due to increasing n, while h remains constant. Thus, for example, the sum $S_3 = 1 + \frac{1}{2} + \frac{1}{3}$ yields the sum

$$S_5 = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5}$$

or

$$S_8 = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}$$

In such cases, as a rule, the absolute value of the error does not diminish with increasing number of terms in the sum.

Let us consider some examples.

1. $S_n = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \dots + \frac{1}{n^2}$. By (11) of Ch. 1 we get

$$S_n \approx \int_{1/2}^{n+1/2} \frac{dx}{x^2} = 2 - \frac{1}{n + \frac{1}{2}}$$

By (9) of Ch. 1 we obtain

$$S_n \approx \int_1^n \frac{dx}{x^2} + \frac{1}{2} + \frac{1}{2n^2} = 1.5 - \frac{1}{n} + \frac{1}{2n^2} = 1.5 - \frac{2n - 1}{2n^2}$$

We know that if n increases without bound, the value of $S_n$ comes arbitrarily close to the number $S = \frac{\pi^2}{6} \approx 1.645$. (We omit the proof here.) For very large n, formula (11) of Ch. 1 yields $S_n = 2$ (error 20%), formula (9), Ch. 1, $S_n = 1.5$ (error 10%).

Observe  that  in  this  case  it  is  easy  to  refine  the  computation. 
The  reason  for  the  rather  considerable  error  lies  in  the  fact  that  the 
first  terms  of  the  sum  vary  rapidly  and,  hence,  over  the  interval  of 
integration  corresponding  to  the  distance  between  two  successive 
terms  of  the  sum  the  function  f(x)  varies  much  too  nonuniformly 
and  so  the  trapezoidal  formula,  on  which  formulas  (9)  and  (11)  of 
Ch.  1 are  based,  gives  a poor  result. 

It is therefore possible to find the sum of a few terms by direct addition and then to apply approximate formulas to the remaining sum. In our example, we start by finding the sum of the first three terms directly:

$$S_3 = 1 + \frac{1}{2^2} + \frac{1}{3^2} = 1 + 0.25 + 0.111 = 1.361$$

Let

$$S'_{n-3} = \frac{1}{4^2} + \frac{1}{5^2} + \dots + \frac{1}{n^2}$$

By formula (11), Ch. 1, we get $S'_{n-3} \approx 0.286 - \frac{1}{n + \frac{1}{2}}$, and by (9), Ch. 1, we obtain $S'_{n-3} \approx 0.281 - \frac{2n - 1}{2n^2}$. Therefore

$$S_n \approx 1.647 - \frac{1}{n + \frac{1}{2}}$$

by (11) of Ch. 1 and

$$S_n \approx 1.642 - \frac{2n - 1}{2n^2}$$

by (9) of Ch. 1. For n increasing without bound, formula (11), Ch. 1, gives $S \approx 1.647$, and formula (9), Ch. 1, yields $S \approx 1.642$. In each case the error is less than 2%.
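The refinement just described can be reproduced in a few lines. The sketch below is ours (the names `midpoint_tail`, `trapezoid_tail` and `refined_sum` are hypothetical); it assumes the limiting forms of formulas (11) and (9) of Ch. 1 as they were used above, with the upper limit of integration taken to infinity, and adds the first m terms directly.

```python
import math

def midpoint_tail(m):
    """Formula (11), Ch. 1, for 1/(m+1)^2 + 1/(m+2)^2 + ...:
    the integral of dx/x^2 from m + 1/2 to infinity."""
    return 1.0 / (m + 0.5)

def trapezoid_tail(m):
    """Formula (9), Ch. 1, for the same tail: the integral of dx/x^2
    from m + 1 to infinity plus half of the first term."""
    return 1.0 / (m + 1) + 0.5 / (m + 1) ** 2

def refined_sum(m, tail):
    head = sum(1.0 / k ** 2 for k in range(1, m + 1))  # first m terms added directly
    return head + tail(m)

exact = math.pi ** 2 / 6
for m in (0, 3, 10):
    print(m, refined_sum(m, midpoint_tail), refined_sum(m, trapezoid_tail), exact)
```

With m = 0 the two estimates are 2 and 1.5, as in the text; with m = 3 they become about 1.647 and 1.642, and adding more terms directly narrows the gap further.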

2. Consider the sum of the decreasing geometric progression

$$S_n = 1 + \frac{1}{z} + \frac{1}{z^2} + \dots + \frac{1}{z^{n-1}}$$

where $z > 1$. The exact formula, it will be recalled, is

$$S_n = \frac{1 - \frac{1}{z^n}}{1 - \frac{1}{z}} = \frac{z - \frac{1}{z^{n-1}}}{z - 1}$$

whence, for unlimited growth of n, we get $S = \frac{z}{z - 1}$. By formula (11) of Ch. 1 we have

$$S_n \approx \int_{-1/2}^{n-1/2} \frac{dx}{z^x} = \frac{1}{\ln z}\left[\sqrt{z} - e^{-\left(n - \frac{1}{2}\right)\ln z}\right]$$

When n increases without bound, we have $S \approx \frac{\sqrt{z}}{\ln z}$. From the table below it is evident that for z close to unity both formulas yield almost the same results:

z              1.2    1.5    2.0    3.0    6.0    20.0
z/(z - 1)      6.00   3.00   2.00   1.5    1.20   1.05
√z / ln z      6.00   3.01   2.04   1.57   1.36   1.49

But if $z \gg 1$, then adjacent terms of the series differ substantially and so the approximate formula produces poor results.
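The two limiting expressions in the table are simple enough to tabulate directly. The following sketch (function names are ours) recomputes both rows for the same values of z; small differences in the last digit relative to the printed table are to be expected from rounding.

```python
import math

def exact_limit(z):
    """Sum of the infinite decreasing progression 1 + 1/z + 1/z^2 + ..."""
    return z / (z - 1)

def midpoint_limit(z):
    """Limit of the estimate given by formula (11), Ch. 1."""
    return math.sqrt(z) / math.log(z)

for z in (1.2, 1.5, 2.0, 3.0, 6.0, 20.0):
    print(f"{z:5.1f} {exact_limit(z):6.2f} {midpoint_limit(z):6.2f}")
```

The ratio of the two columns stays close to 1 for z near unity and exceeds 1.4 at z = 20, which is the deterioration noted in the text.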

In  some  cases  the  sum  may  increase  without  bound  as  the 
number  of  terms  increases,  despite  the  fact  that  the  terms  of  the  sum 
diminish  without  bound  (cases  of  the  divergence  of  the  corresponding 
infinite  series). 

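Let us consider some examples.

1. $S_n = 1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \dots + \frac{1}{\sqrt{n}}$.

By formula (11), Ch. 1, we get

$$S_n \approx \int_{1/2}^{n+1/2} \frac{dx}{\sqrt{x}} = 2\sqrt{x}\,\Big|_{1/2}^{n+1/2} = 2\sqrt{n + \tfrac{1}{2}} - 2\sqrt{\tfrac{1}{2}}$$

By (9), Ch. 1, we obtain

$$S_n \approx 2\sqrt{n} - 1.5 + \frac{1}{2\sqrt{n}}$$

For large n we find from (11), Ch. 1,

$$S_n \approx 2\sqrt{n} - 1.41$$

and from (9), Ch. 1, $S_n \approx 2\sqrt{n} - 1.50$. (Here we drop the term proportional to $\frac{1}{\sqrt{n}}$.)

A more exact formula (obtained by straightforward addition of the first few terms) is

$$S_n \approx 2\sqrt{n} - 1.466$$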

2. In many problems we encounter the sum

$$S_n = 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n}$$

For large n, using (11) of Ch. 1, we find $S_n \approx \ln n + \ln 2 = \ln n + 0.69$; using (9) of Ch. 1 we obtain $S_n \approx \ln n + 0.50$. The limit of the difference $S_n - \ln n$ for unlimited increase in n is denoted by C and is called Euler's constant. Thus we can write the formula $S_n = \ln n + C + \alpha_n$, where $\alpha_n \to 0$ as $n \to \infty$. Therefore the asymptotically exact formula is of the form $S_n \approx \ln n + C$. We obtained very rough values for the Euler constant, but formulas (9) and (11) of Ch. 1 enable us to obtain a more exact value for C by straightforward summing of the first few terms. It turns out that C = 0.5772 ...
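As a hedged illustration of how the value of C sharpens, the sketch below adds the first m terms directly and applies formula (9) of Ch. 1 to the remaining terms $\frac{1}{m+1} + \dots + \frac{1}{n}$, then lets n grow so that the term $\frac{1}{2n}$ drops out. The function names are ours; m = 0 reproduces the rough value 0.50 found above.

```python
import math

def harmonic(n):
    """Direct sum 1 + 1/2 + ... + 1/n (zero for n = 0)."""
    return sum(1.0 / k for k in range(1, n + 1))

def euler_estimate(m):
    """First m terms added directly; formula (9), Ch. 1, applied to the rest:
    S_n ~ H_m + ln n - ln(m + 1) + 1/(2(m + 1)), so C ~ the value below."""
    return harmonic(m) - math.log(m + 1) + 0.5 / (m + 1)

for m in (0, 5, 10, 100):
    print(m, round(euler_estimate(m), 5))
```

The estimates approach 0.5772 quickly as m increases.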

Now  let  us  examine  sums  whose  terms  increase  with  increasing  n. 
The  sum  here  increases  without  bound  with  the  growth  of  n (i.e. 
with  increase  in  the  number  of  terms).  Two  cases  are  possible  for 
increasing  n. 

1. The error of the approximate formulas (9) and (11) of Ch. 1 decreases in absolute value with increasing n or, if it increases, it does so more slowly than the sum so that the relative error diminishes. This case results if the terms of the sum increase more slowly than a geometric series, for instance, like powers.

2.  With  increasing  n,  the  relative  error  (and  all  the  more  so 
the  absolute  error)  does  not  decrease. 

The second case is obtained when the terms of the sum increase in geometric progression, that is, when the sum is of the form $S_n = a + ay + ay^2 + ay^3 + \dots + ay^{n-1}$, where $|y| > 1$, and also if the terms of the sum grow faster than a progression, for example, $S_n = y + y^4 + y^9 + \dots + y^{n^2}$. In this case the last term constitutes the major portion of the entire sum. To illustrate, in the case of the sum $S_n = y + y^4 + y^9 + \dots + y^{n^2}$ we give a table of the values of $\frac{S_n}{y^{n^2}}$ when y = 2:

n              1      2      3      4        5
2^(n^2)        2      16     512    65 536   33 500 000
S_n            2      18     530    66 066   33 600 000
S_n/2^(n^2)    1      1.12   1.03   1.01     1.003

It is evident from the table that for large n the magnitude of the entire sum is practically determined by the last term alone.
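The table can be regenerated with exact integer arithmetic. This is a small sketch of ours (the function name is hypothetical); the last column is the ratio of the sum to its last term, rounded to three decimals.

```python
def squares_exponent_sum(y, n):
    """S_n = y + y^4 + y^9 + ... + y^(n^2), computed exactly for integer y."""
    return sum(y ** (k * k) for k in range(1, n + 1))

for n in range(1, 6):
    last_term = 2 ** (n * n)
    total = squares_exponent_sum(2, n)
    print(n, last_term, total, round(total / last_term, 3))
```

The ratio tends to 1 very rapidly, so the last term alone already gives the sum to good relative accuracy.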

The situation is similar for an ascending geometric series. Indeed, in the formula

$$S_n = 1 + z + \dots + z^{n-1} = \frac{z^n - 1}{z - 1} \qquad (|z| > 1)$$

we disregard unity in comparison with $z^n$ and get

$$S_n \approx \frac{z^n}{z - 1}$$

And so

$$\frac{S_n}{z^{n-1}} \approx \frac{z^n}{z^{n-1}(z - 1)} = \frac{z}{z - 1} = \text{constant}$$

(for sufficiently large n). Thus, in this case the portion of the contribution of the last term approaches a constant, and for large $|z|$ this portion is close to unity. Clearly, then, there is no need of summation formulas in this case, for a high degree of accuracy can be obtained merely by taking the sum of the last few terms.

Summation formulas are useful when the ratio of the sum to the last term increases with increasing n. Then the formula makes it possible to reduce computations for large n and we always have Case 1, that is, the relative error in the formulas (9) and (11) of Ch. 1 definitely decreases. Here is an example.

From elementary algebra we have the familiar formula

$$S_n = 1^2 + 2^2 + 3^2 + \dots + n^2 = \frac{n(n + 1)(2n + 1)}{6} = \frac{n^3}{3} + \frac{n^2}{2} + \frac{n}{6}$$

Apply the approximate formula (11), Ch. 1, to the sum $S_n$ to get

$$S_n \approx S'_n = \int_{1/2}^{n+1/2} x^2\,dx = \frac{\left(n + \frac{1}{2}\right)^3 - \left(\frac{1}{2}\right)^3}{3} = \frac{n^3}{3} + \frac{n^2}{2} + \frac{n}{4}$$

The absolute error is $S'_n - S_n = \frac{n}{12}$ and hence increases with increasing n. The relative error

$$\frac{S'_n - S_n}{S_n} = \frac{n}{12\left(\frac{n^3}{3} + \frac{n^2}{2} + \frac{n}{6}\right)} \approx \frac{n}{12 \cdot \frac{n^3}{3}} = \frac{1}{4n^2}$$

It falls off rapidly with increasing n.
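The statement about the absolute error can be confirmed exactly with rational arithmetic. The sketch below is ours and uses Python's `fractions` module; it evaluates the integral in formula (11) of Ch. 1 in closed form and compares it with the exact sum.

```python
from fractions import Fraction

def exact_sum_of_squares(n):
    """1^2 + 2^2 + ... + n^2 by the formula from elementary algebra."""
    return Fraction(n * (n + 1) * (2 * n + 1), 6)

def midpoint_sum_of_squares(n):
    """Formula (11), Ch. 1: the integral of x^2 from 1/2 to n + 1/2."""
    upper = Fraction(2 * n + 1, 2)
    lower = Fraction(1, 2)
    return (upper ** 3 - lower ** 3) / 3

for n in (1, 10, 100):
    error = midpoint_sum_of_squares(n) - exact_sum_of_squares(n)
    print(n, error, float(error / exact_sum_of_squares(n)))
```

The absolute error printed is exactly n/12, while the relative error behaves like $\frac{1}{4n^2}$ for large n.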

Consider the sum

$$S_n^{(p)} = 1^p + 2^p + 3^p + \dots + n^p \qquad (p > -1)$$

Applying (10), Ch. 1, we get

$$S_n^{(p)} \approx \int_1^n x^p\,dx = \frac{n^{p+1}}{p + 1} - \frac{1}{p + 1}$$

Thus, for large n the following simple but rough formula holds true:*

$$S_n^{(p)} \approx \frac{n^{p+1}}{p + 1}$$

* When p is a positive integer, elementary algebra permits writing an exact formula for $S_n^{(p)}$ (incidentally, we made use of such a formula for p = 2). However, for large p the formulas become unwieldy and so the rough formula given here may prove useful for positive integers as well.

Exercises

1. Refine the value of Euler's constant C (see page 96) by finding the sum of the first five terms and the first ten terms directly.

2. Suppose we have a sum

$$S_n = u_1 + u_2 + \dots + u_n$$

such that

$$0 < u_1 < u_2 < u_3 < \dots < u_n$$

Form the sum

$$\sigma_n = 1 + \frac{u_{n-1}}{u_n} + \frac{u_{n-2}}{u_n} + \dots + \frac{u_1}{u_n}$$

It is clear that the terms of this sum decrease. How does $\sigma_n$ behave when n increases if $S_n$ is an increasing sum of the first type? of the second type?

3.6  Integrals  depending  on  a parameter 

Consider an integral of the form

$$I = \int_a^b f(x, \lambda)\,dx \tag{45}$$

where, under the integral sign, we have the variable of integration x and also a parameter (arbitrary constant) $\lambda$, that is, a quantity that remains constant throughout the integration but can in general assume distinct values. Then the result of the integration will, generally, depend on $\lambda$, that is, $I = I(\lambda)$. Such integrals are often encountered in applications when the integrand includes certain masses, dimensions, and the like, which remain constant throughout the integration process. Here are some simple instances:

$$\int_0^1 (x^2 + \lambda x)\,dx = \frac{1}{3} + \frac{\lambda}{2}, \qquad \int_0^1 \sin \alpha x\,dx = \frac{1 - \cos \alpha}{\alpha}, \qquad \int_0^1 (s + 1)x^s\,dx = 1 \quad (s > -1)$$

In the case of proper integrals we have the same properties as are encountered in the consideration of finite sums of functions. We know, for instance, that the derivative of a sum of functions is equal to the sum of the derivatives. Similarly, the derivative of integral (45) with respect to the parameter is equal to the integral of the derivative with respect to that parameter:

$$\frac{dI}{d\lambda} = \int_a^b \frac{\partial f(x, \lambda)}{\partial \lambda}\,dx$$

Under the integral sign we have here the derivative of the function $f(x, \lambda)$ with respect to $\lambda$, taken for fixed x. A similar rule holds for integration with respect to a parameter:

$$\int_\alpha^\beta I(\lambda)\,d\lambda = \int_a^b \left(\int_\alpha^\beta f(x, \lambda)\,d\lambda\right) dx$$
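The rule for differentiation with respect to a parameter can be checked numerically on the second instance above, $\int_0^1 \sin \alpha x\,dx$. The sketch below is ours: it uses a plain composite Simpson's rule (our choice of quadrature, not prescribed by the text) and compares the integral of the differentiated integrand with a finite-difference derivative of the integral itself.

```python
import math

def simpson(func, a, b, steps=200):
    """Composite Simpson's rule; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def I(alpha):
    """I(alpha) = integral of sin(alpha * x) over 0 <= x <= 1."""
    return simpson(lambda x: math.sin(alpha * x), 0.0, 1.0)

def dI_by_rule(alpha):
    """Integral of the derivative of the integrand with respect to alpha."""
    return simpson(lambda x: x * math.cos(alpha * x), 0.0, 1.0)

def dI_by_difference(alpha, step=1e-5):
    """Central-difference derivative of I(alpha)."""
    return (I(alpha + step) - I(alpha - step)) / (2 * step)

alpha = 2.0
print(I(alpha), (1 - math.cos(alpha)) / alpha)
print(dI_by_rule(alpha), dI_by_difference(alpha))
```

Both lines print pairs of numbers that agree to many decimal places, as the rule predicts for a proper integral.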

To verify these simple rules we would have to write them out for the integral sums and then pass, in the limit, from sums to integrals, but we will not go into that here.

For improper integrals dependent on parameters there can be complications due first of all to the possibility of their divergence. The simplest case is that of "regularly convergent" improper integrals. Thus, an integral of the type

$$I(\lambda) = \int_a^\infty f(x, \lambda)\,dx \tag{46}$$

where the function f itself is finite, is called a regularly convergent integral if it is dominated by a convergent integral not dependent on the parameter, that is, if $|f(x, \lambda)| \le F(x)$ where $\int_a^\infty F(x)\,dx < \infty$. For example, the integral

$$\int_1^\infty \frac{\sin \lambda x}{x^2}\,dx$$

is regularly convergent since

$$\left|\frac{\sin \lambda x}{x^2}\right| \le \frac{1}{x^2}, \qquad \int_1^\infty \frac{dx}{x^2} = 1 < \infty$$

The properties of regularly convergent integrals are the same as those of proper integrals.*

In the study of irregularly convergent integrals (and also divergent integrals of type (46)), one often does as follows: one cuts off the singularity, that is, one passes to a proper integral:

$$I_N(\lambda) = \int_a^N f(x, \lambda)\,dx$$

and then considers the asymptotic behaviour of this integral as $N \to \infty$. In this way we can justify operations on improper and even divergent integrals.

* Here the requirement of regular convergence may be replaced by the somewhat less restrictive requirement of uniform convergence, which signifies that

$$\max_\lambda \left|\int_N^\infty f(x, \lambda)\,dx\right| \xrightarrow[N\to\infty]{} 0$$

It is very important that the result of performing various operations (subtraction, differentiation with respect to a parameter and the like) on divergent integrals may prove to be a finite quantity; in particular, it may be expressed in terms of convergent integrals (the converse can also occur). To illustrate, let us consider the divergent integral

$$I(\lambda) = \int_0^\infty \frac{dx}{x + \lambda} = \infty \qquad (\lambda > 0)$$

After differentiating with respect to the parameter we arrive at the convergent integral

$$\frac{dI}{d\lambda} = \int_0^\infty \frac{\partial}{\partial \lambda}\left(\frac{1}{x + \lambda}\right) dx = -\int_0^\infty \frac{dx}{(x + \lambda)^2} = \frac{1}{x + \lambda}\,\bigg|_{x=0}^{\infty} = -\frac{1}{\lambda}$$

To get to the meaning of this equation, we cut off the singularity to obtain

$$I_N(\lambda) = \int_0^N \frac{dx}{x + \lambda} = \ln (x + \lambda)\,\Big|_{x=0}^{N} = \ln (N + \lambda) - \ln \lambda$$

Differentiating we get

$$\frac{dI_N}{d\lambda} = \frac{1}{N + \lambda} - \frac{1}{\lambda}$$

Now, if $N \to \infty$, then $I_N(\lambda) \to \infty$ and $\frac{dI_N}{d\lambda} \to -\frac{1}{\lambda}$, that is, we obtain the foregoing result. Verify that the following equation has a similar meaning:

$$I(\lambda_1) - I(\lambda_2) = \int_0^\infty \left[\frac{1}{x + \lambda_1} - \frac{1}{x + \lambda_2}\right] dx = \ln \lambda_2 - \ln \lambda_1 \;{}^*$$
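The cut-off procedure can be followed with concrete numbers. The sketch below (names are ours) uses the closed form $I_N(\lambda) = \ln (N + \lambda) - \ln \lambda$ found above and shows that, while $I_N$ itself keeps growing with N, both its derivative with respect to $\lambda$ and the difference $I_N(\lambda_1) - I_N(\lambda_2)$ settle to finite limits.

```python
import math

def I_N(lam, N):
    """Cut-off integral of dx/(x + lam) from 0 to N, in closed form."""
    return math.log(N + lam) - math.log(lam)

def dI_N(lam, N):
    """Derivative of the cut-off integral with respect to lam."""
    return 1.0 / (N + lam) - 1.0 / lam

for N in (1e2, 1e4, 1e6):
    print(N, I_N(1.0, N), dI_N(2.0, N), I_N(1.0, N) - I_N(3.0, N))
```

The second column grows without bound, the third tends to $-\frac{1}{2}$, and the fourth tends to $\ln 3 - \ln 1$.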

We take another example to illustrate the complication that may arise in the case of irregularly convergent integrals. Consider the integral

$$I(\lambda) = \int_0^\infty \frac{\sin \lambda x}{x}\,dx$$

* In the 1950's physicists developing quantum electrodynamics began to make extensive use of an imperfect theory containing divergent integrals. It has been common practice to compute quantities like the derivatives and differences of these integrals. Such things turn out to be finite and are in excellent agreement with experiment, which is the ultimate criterion of the truth.

Fig. 22

It can be verified (see Exercise 2) that for $\lambda = 1$ the integral converges and is equal to $\frac{\pi}{2}$. From this, using the substitution $\lambda x = s$ for $\lambda > 0$, we immediately get

$$I(\lambda) = \int_0^\infty \frac{\sin s}{s/\lambda}\,\frac{ds}{\lambda} = \int_0^\infty \frac{\sin s}{s}\,ds = \frac{\pi}{2}$$

At the same time, when $\lambda = 0$, we get $I = 0$ and for $\lambda < 0$, taking $-1$ outside the integral sign, we obtain $I = -\frac{\pi}{2}$. Thus, in this case $I(\lambda)$ has a jump discontinuity at $\lambda = 0$. This might appear to be strange since for a small variation of $\lambda$ the integrand varies to an arbitrarily small degree. However, a small change in the integrand over an infinite interval can lead to a rather substantial change in the integral! Fig. 22 shows the discontinuous graph of $I(\lambda)$ and the graphs of

$$I_N(\lambda) = \int_0^N \frac{\sin \lambda x}{x}\,dx$$

for different N. Although these latter graphs do not have discontinuities, for large N the transition from $-\frac{\pi}{2}$ to $\frac{\pi}{2}$ takes place over a small interval of $\lambda$, and the larger the N, the smaller the interval. In the limit, when $N = \infty$, this transition is accomplished over an infinitely small interval of $\lambda$, which is to say we have a discontinuity.
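The steepening of the transition can be seen without drawing the graphs. The following sketch is ours: it evaluates the cut-off integral $I_N(\lambda)$ by composite Simpson's rule (our choice of method; the step density `steps_per_unit` is an arbitrary assumption) for a few values of $\lambda$ near zero and two values of N.

```python
import math

def sinc_integrand(lam, x):
    """sin(lam * x) / x, with the limiting value lam at x = 0."""
    return lam if x == 0.0 else math.sin(lam * x) / x

def I_N(lam, N, steps_per_unit=200):
    """Simpson's rule for the integral of sin(lam * x) / x from 0 to N."""
    steps = 2 * max(1, int(N * steps_per_unit / 2))  # even number of steps
    h = N / steps
    total = sinc_integrand(lam, 0.0) + sinc_integrand(lam, N)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * sinc_integrand(lam, i * h)
    return total * h / 3

for N in (10.0, 100.0):
    row = [round(I_N(lam, N), 3) for lam in (-1.0, -0.1, -0.01, 0.0, 0.01, 0.1, 1.0)]
    print(N, row)
```

For N = 100 the values at $\lambda = \pm 0.1$ are already close to $\pm\frac{\pi}{2}$, whereas for N = 10 they are still well inside the transition region.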

We shall return (in different notations) to the functions

$$I(\lambda) = \int_0^\infty \frac{\sin \lambda x}{x}\,dx \quad \text{and} \quad \frac{dI}{d\lambda} = \int_0^\infty \cos \lambda x\,dx$$

in Sec. 6.3 in connection with the theory of discontinuous functions and in Sec. 14.2 in connection with the Fourier transformation.

When  considering  series  whose  terms  depend  on  a parameter, 
the  very  same  questions  arise  as  in  the  study  of  improper  integrals 
dependent  on  a parameter.  The  properties  are  completely  analogous 
and  so  we  will  not  dwell  on  them  here. 


Exercises

1. Starting with the integral $\int_0^\infty e^{-\lambda x}\,dx$ ($\lambda > 0$), use differentiation with respect to the parameter for $\lambda = 1$ to obtain the value of the integral $\int_0^\infty x^n e^{-x}\,dx$ (cf. Sec. 3.3), and use integration with respect to the parameter to evaluate the integral $\int_0^\infty \frac{e^{-x} - e^{-\lambda x}}{x}\,dx$ (note that in the latter example the indefinite integral is not expressible in terms of elementary functions).

2. Starting with the integral $\int_0^\infty e^{-\lambda x} \sin x\,dx$ ($\lambda > 0$), integrate with respect to the parameter from $\alpha$ to $\beta$ and, setting $\beta \to \infty$, obtain the formula

$$\int_0^\infty e^{-\alpha x}\,\frac{\sin x}{x}\,dx = \frac{\pi}{2} - \arctan \alpha \qquad (\alpha > 0) \tag{47}$$

(The value of this integral for $\alpha = 0$ was mentioned in the text.)

ANSWERS  AND  SOLUTIONS 

Sec.  3.1 

Straightforward use of the trapezoidal rule or Simpson's rule is impossible since the integrand blows up at x = 0, and so we split the interval of integration into two parts: from $x = 0$ to $x = \frac{\pi}{6}$ and from $x = \frac{\pi}{6}$ to $x = \frac{\pi}{2}$. For $0 < x < \frac{\pi}{6}$ the relation $\sin x \approx x$ holds true and so

$$I_1 = \int_0^{\pi/6} \frac{dx}{\sqrt[3]{\sin x}} \approx \int_0^{\pi/6} \frac{dx}{\sqrt[3]{x}} = \frac{3}{2}\,x^{2/3}\,\Big|_0^{\pi/6} = 1.5 \times 0.65 = 0.98$$

Evaluate the integral $I_2 = \int_{\pi/6}^{\pi/2} \frac{dx}{\sqrt[3]{\sin x}}$ by Simpson's rule. Split the interval into two parts to obtain $I_2 \approx 1.13$, whence

$$\int_0^{\pi/2} \frac{dx}{\sqrt[3]{\sin x}} \approx 0.98 + 1.13 = 2.11$$
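The two pieces of this computation are short enough to repeat in code. The sketch below is ours; for comparison it also prints a closed-form value of the integral through the gamma function, which is a standard result that the text itself does not use.

```python
import math

def integrand(x):
    """1 / (sin x)^(1/3)"""
    return 1.0 / math.sin(x) ** (1.0 / 3.0)

split = math.pi / 6
I1 = 1.5 * split ** (2.0 / 3.0)  # sin x replaced by x on (0, pi/6)

# Simpson's rule on [pi/6, pi/2] with the interval split into two parts
h = (math.pi / 2 - split) / 2
I2 = h / 3 * (integrand(split) + 4 * integrand(split + h) + integrand(math.pi / 2))

# Closed form via the gamma function (an outside check, not from the text)
exact = math.sqrt(math.pi) / 2 * math.gamma(1 / 3) / math.gamma(5 / 6)
print(I1, I2, I1 + I2, exact)
```

The sum of the two parts agrees with the closed-form value to within about one percent.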


Sec.  3.2 

1. The value of the first of these integrals has already been obtained (see Problem 2 in the Exercises of Sec. 1.1). Let us find $\int_3^\infty \frac{x\,dx}{e^x + 1}$. The integrand is $f(x) = \frac{x}{e^x + 1}$ and so $f'(x) = \frac{e^x + 1 - xe^x}{(e^x + 1)^2}$. Since $f'(3) \ne 0$, we apply formula (19) to get

$$\int_3^\infty \frac{x\,dx}{e^x + 1} \approx 0.23, \qquad \int_0^\infty \frac{x\,dx}{e^x + 1} \approx 0.63 + 0.23 = 0.86$$

Observe, for the sake of comparison, that the exact value of the desired integral (to two decimal places) is 0.82.

2. 0.67, 0.59. (The exact value to three places is 0.612.)

3. Here, $y = e^{-x^2}$, $y' = -2xe^{-x^2}$, $a = N$; by (19) we find that

$$\int_N^\infty e^{-x^2}\,dx \approx \left(e^{-N^2}\right)^2 : 2Ne^{-N^2} = \frac{1}{2N}\,e^{-N^2}$$

as was indicated on page 69.

Sec. 3.3

The fraction D is, after the indicated transformation, equal to

$$\frac{[2 \cdot 4 \cdot 6 \cdots (2n - 2)]^2}{\sqrt{\pi n}\; 1 \cdot 2 \cdot 3 \cdot 4 \cdots (2n - 3)(2n - 2)} = \frac{[2^{n-1} \cdot 1 \cdot 2 \cdot 3 \cdots (n - 1)]^2}{\sqrt{\pi n}\,(2n - 2)!} = \frac{2^{2n-2}[(n - 1)!]^2}{\sqrt{\pi n}\,(2n - 2)!}$$

Using Stirling's formula we obtain, for large n,

$$\frac{2^{2n-2}\left[\sqrt{2\pi(n - 1)}\left(\frac{n - 1}{e}\right)^{n-1}\right]^2}{\sqrt{\pi n}\,\sqrt{2\pi(2n - 2)}\left(\frac{2n - 2}{e}\right)^{2n-2}} = \frac{2^{2n-2}\, 2\pi (n - 1)(n - 1)^{2n-2}}{2\pi\sqrt{n(n - 1)}\; 2^{2n-2}(n - 1)^{2n-2}} \approx 1$$
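The conclusion that the fraction D is close to 1 for large n can be checked directly, without Stirling's formula, by working with logarithms of factorials. The sketch below is ours and relies on the identity $\ln (k!) = \texttt{math.lgamma}(k + 1)$ from Python's standard library.

```python
import math

def D(n):
    """2^(2n-2) * ((n-1)!)^2 / (sqrt(pi * n) * (2n-2)!), computed through logarithms."""
    log_numerator = (2 * n - 2) * math.log(2) + 2 * math.lgamma(n)        # (n-1)! = Gamma(n)
    log_denominator = 0.5 * math.log(math.pi * n) + math.lgamma(2 * n - 1)  # (2n-2)! = Gamma(2n-1)
    return math.exp(log_numerator - log_denominator)

for n in (10, 100, 1000, 100000):
    print(n, D(n))
```

The printed values increase towards 1 as n grows.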


Sec.  3.4 

1. $I_{(28)} = \dfrac{2 \sin \omega a}{\omega(1 + a^2)}$, $\quad I_{(29)} = I_{(28)} + \dfrac{4a \cos \omega a}{\omega^2(1 + a^2)^2}$, $\quad I_{(30)} = I_{(28)} - \dfrac{4a \cos \omega a}{\omega^2(1 + a^2)^2}$

2.

3.

(32)  +ibw(fw)  sin  (t0?w  + a)] 


6 

x—a 


For $\omega = 1$ the exact value is $I = \mathrm{Si}\,2\omega - \mathrm{Si}\,\omega = 0.66$; approximate values are $I_{(28)} = 0.75$, $I_{(29)} = 0.13$, $I_{(30)} = 1.36$. It is evident that the accuracy is quite unsatisfactory. For $\omega = 10$ the exact value is $I = -0.110$, the approximate values are $I_{(28)} = -0.104$, $I_{(29)} = -0.115$, $I_{(30)} = -0.112$, with accuracy of the order of a few percent.


Sec.  3.5 

1. $S_n = S_5 + S_{n-5}$, where $S_{n-5} = \frac{1}{6} + \frac{1}{7} + \dots + \frac{1}{n}$. Using formula (9), Ch. 1, we obtain

$$S_{n-5} \approx \int_6^n \frac{dx}{x} + \frac{1}{12} + \frac{1}{2n} \approx \ln n - \ln 6 + 0.083$$

Since

$$S_5 = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} = 2.283$$

it follows that

$$S_n \approx \ln n - \ln 6 + 0.083 + 2.283 = \ln n + 0.575$$

whence C = 0.575. Summing the first 10 terms, we obtain C = 0.576.

2. In the first case, $\sigma_n$ increases without bound as n increases without bound; in the second case, $\sigma_n$ approaches 1 without bound.

Sec.  3.6 
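1. We have

$$\int_0^\infty e^{-\lambda x}\,dx = -\frac{e^{-\lambda x}}{\lambda}\,\bigg|_{x=0}^{\infty} = \frac{1}{\lambda} \qquad (\lambda > 0) \tag{48}$$

This integral converges regularly on any interval $\alpha \le \lambda \le \beta$ where $0 < \alpha < \beta < \infty$ since on such an interval $e^{-\lambda x} \le e^{-\alpha x}$ and $\int_0^\infty e^{-\alpha x}\,dx < \infty$. Differentiating with respect to the parameter $\lambda$, we successively obtain

$$\int_0^\infty (-x)\,e^{-\lambda x}\,dx = (-1)\lambda^{-2}, \qquad \int_0^\infty (-x)^2 e^{-\lambda x}\,dx = (-1)(-2)\lambda^{-3}, \quad \dots,$$

$$\int_0^\infty (-x)^n e^{-\lambda x}\,dx = (-1)(-2) \cdots (-n)\,\lambda^{-(n+1)}$$

Putting $\lambda = 1$ and cancelling out $(-1)^n$, we get

$$\int_0^\infty x^n e^{-x}\,dx = n!$$

Integrating formula (48) from 1 to $\lambda$ with respect to the parameter, we obtain

$$\int_0^\infty \left(\int_1^\lambda e^{-\lambda x}\,d\lambda\right) dx = \int_0^\infty \frac{e^{-x} - e^{-\lambda x}}{x}\,dx = \ln \lambda$$

2. The integral

$$\int_0^\infty e^{-\lambda x} \sin x\,dx = \frac{1}{\lambda^2 + 1} \qquad (\lambda > 0)$$

is evaluated in elementary fashion with the aid of the indefinite integral. As in Exercise 1, the integral converges regularly on any interval $\alpha \le \lambda \le \beta$ ($0 < \alpha < \beta < \infty$).

Integrating with respect to $\lambda$ from $\alpha$ to $\beta$, we get

$$\int_0^\infty \left(e^{-\alpha x} - e^{-\beta x}\right) \frac{\sin x}{x}\,dx = \arctan \beta - \arctan \alpha$$

Passing to the limit as $\beta \to \infty$, we get formula (47). For $\alpha = 0$ this formula yields

$$\int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2}$$

It is to be noted that formula (47) was derived for $\alpha > 0$ and its validity for $\alpha = 0$ requires special substantiation, which we will not dwell on here.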


Chapter  4 

FUNCTIONS  OF  SEVERAL  VARIABLES 


Up to now we have considered only functions of one independent variable. Functions of several variables have appeared from time to time (see HM, Sec. 3.12). The theory of functions dependent on a number of independent variables contains many new features, and, what is more, nearly all the fundamentally new features are revealed in functions of two variables:

$$z = f(x, y)$$

For the sake of simplicity we will as a rule consider precisely this case.

The concept itself of a function of two variables is simple enough: a quantity z is specified via a formula or table so that to every pair of values x and y there corresponds a definite value of z.

4.1  Partial  derivatives 

In the case of a function of one variable, $g = g(x)$, a small change dx in the variable x leads to a small change in the function g, and

$$g(x + dx) - g(x) = g'(x)\,dx$$

In this formula, the right-hand side of which is denoted by dg, we neglect terms of order $(dx)^2$ since dx is a very small quantity. In the case of a function of two variables the variation of the function occurs as a result of changes in both variables x and y and is equal to $f(x + dx, y + dy) - f(x, y)$.

We now show* that this change consists of two parts, one part being proportional to dx and the other to dy; that is,

$$f(x + dx, y + dy) - f(x, y) = a\,dx + b\,dy \tag{1}$$

Here, we disregard terms of order $(dx)^2$, $(dy)^2$, $dx \cdot dy$ since dx and dy are extremely small quantities.

* With certain alterations, this is a repetition of the reasoning given in HM, Sec. 3.12.

We write the left part of (1) thus:

$$f(x + dx, y + dy) - f(x, y) = f(x + dx, y + dy) - f(x + dx, y) + f(x + dx, y) - f(x, y)$$

Let us consider the difference of the last two terms, $f(x + dx, y) - f(x, y)$. For every concrete fixed y, this difference, to within terms of the order of $(dx)^2$, is the differential of the function depending solely on x:

$$f(x + dx, y) - f(x, y) = \frac{\partial f(x, y)}{\partial x}\bigg|_y dx$$

where $\frac{\partial f(x, y)}{\partial x}\Big|_y$ denotes the derivative of the function $f(x, y)$ calculated on the supposition that y is constant. It is called a partial derivative of the function $f(x, y)$ with respect to x with y held constant. It can also be denoted by $f'_x(x, y)$. Similarly, to within terms of the order of $(dy)^2$,

$$f(x + dx, y + dy) - f(x + dx, y) = \frac{\partial f(x + dx, y)}{\partial y}\bigg|_{x+dx} dy$$

where $\frac{\partial f(x + dx, y)}{\partial y}\Big|_{x+dx}$ is the partial derivative with respect to y with the first argument, equal to $x + dx$, held constant. It is clear that

$$\frac{\partial f(x + dx, y)}{\partial y}\bigg|_{x+dx} - \frac{\partial f(x, y)}{\partial y}\bigg|_x$$

is the smaller, the smaller dx is; more precisely,

$$\frac{\partial f(x + dx, y)}{\partial y}\bigg|_{x+dx} - \frac{\partial f(x, y)}{\partial y}\bigg|_x = \alpha\,dx$$

where $\alpha$ is a bounded quantity. With the indicated simplifications, the left side of (1) is

$$\frac{\partial f(x, y)}{\partial x}\bigg|_y dx + \left[\frac{\partial f(x, y)}{\partial y}\bigg|_x + \alpha\,dx\right] dy = \frac{\partial f(x, y)}{\partial x}\bigg|_y dx + \frac{\partial f(x, y)}{\partial y}\bigg|_x dy + \alpha\,dx\,dy$$

Finally, noting that the term $\alpha\,dx\,dy$ may be disregarded, we find that, to within quantities of the order of $(dx)^2$, $(dy)^2$ and $dx\,dy$, the left-hand side of (1) is equal to a sum, which is denoted by df and is called the total differential of the function f:

$$df = \frac{\partial f(x, y)}{\partial x}\bigg|_y dx + \frac{\partial f(x, y)}{\partial y}\bigg|_x dy \tag{2}$$

Comparing this with (1), we get

$$a = \frac{\partial f(x, y)}{\partial x}\bigg|_y = f'_x(x, y), \qquad b = \frac{\partial f(x, y)}{\partial y}\bigg|_x = f'_y(x, y)$$

If from the context it is clear what quantity is assumed to be constant in a calculation of a partial derivative, it is not indicated and instead of $\frac{\partial f}{\partial x}\Big|_y$ we write $\frac{\partial f}{\partial x}$. However, since the variables in the problem as a whole are x and y, the derivative is written with a round d, $\partial$, in order to distinguish it from the ordinary derivative.* From the foregoing it follows that the quantity $\frac{\partial f}{\partial x}$ is to be found as if y remains unchanged in the expression $f(x, y)$. In exactly the same way, we find $\frac{\partial f}{\partial y}$ on the assumption that only y varies and x is held constant. For example, if

$$f(x, y) = x^2 y^3 + x e^y \tag{3}$$

then

$$\frac{\partial f}{\partial x} = 2xy^3 + e^y, \qquad \frac{\partial f}{\partial y} = 3x^2 y^2 + x e^y$$
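The partial derivatives of the function (3) and formula (2) for the total differential can be checked by finite differences. The sketch below is ours; the point $(x_0, y_0)$ and the step sizes are arbitrary choices made only for illustration.

```python
import math

def f(x, y):
    """The function (3): x^2 y^3 + x e^y."""
    return x ** 2 * y ** 3 + x * math.exp(y)

def partial_x(func, x, y, step=1e-6):
    """Central difference in x with y held constant."""
    return (func(x + step, y) - func(x - step, y)) / (2 * step)

def partial_y(func, x, y, step=1e-6):
    """Central difference in y with x held constant."""
    return (func(x, y + step) - func(x, y - step)) / (2 * step)

x0, y0 = 1.5, 0.7
fx = 2 * x0 * y0 ** 3 + math.exp(y0)            # formula for df/dx
fy = 3 * x0 ** 2 * y0 ** 2 + x0 * math.exp(y0)  # formula for df/dy
print(partial_x(f, x0, y0), fx)
print(partial_y(f, x0, y0), fy)

# Total differential (2) against the actual increment for small dx, dy
dx = dy = 1e-3
increment = f(x0 + dx, y0 + dy) - f(x0, y0)
df = fx * dx + fy * dy
print(increment, df)
```

The increment and the total differential differ only by a quantity of the order of $(dx)^2$, $(dy)^2$ and $dx\,dy$, as stated in the derivation.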

To get a better picture of formula (2), consider the "plane of independent variables", that is, the xy-plane. In this plane, every point M has definite coordinates x and y and therefore this point is associated with a definite value of the function $f(x, y)$; we can say that the function assumes a definite value at every point of the plane. If we give a small increment to x or to y or to both variables, then in the xy-plane we get points of a small rectangle MNPQ, which is shown in Fig. 23 enlarged. The coordinates of the vertices are indicated inside the rectangle, and the function values (with second-order infinitesimals dropped) given outside the rectangle; f is to be understood as the value of $f(x, y)$.

A formula similar to (2) holds true for any number of independent variables; for example

$$df(x, y, z, u) = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz + \frac{\partial f}{\partial u}\,du$$

Here $\frac{\partial f}{\partial x}$ is calculated on the assumption that y, z, u are held constant, $\frac{\partial f}{\partial y}$ is calculated with x, z, u held constant, and so on.

* In the formula $\frac{d}{dx}\left(e^{kx}\right) = ke^{kx}$ we also actually have to do with a partial derivative calculated with k held constant. However, there is no need to use the partial derivative sign since k remained constant throughout the discussion of the problem.

Fig. 23

We give an example to show that the value of a partial derivative is essentially dependent on how the other fixed variables are chosen. From physics we know that the energy of a capacitor is $W = \frac{C\varphi^2}{2} = \frac{q^2}{2C}$, where $q = C\varphi$; here, C is the capacity of the capacitor, $\varphi$ is the potential difference on the faces, and q is the quantity of electricity, or the charge. Considering the relationship $W = W(C, \varphi)$, we get $\frac{\partial W}{\partial C}\Big|_\varphi = \frac{\varphi^2}{2}$. Considering $W = W(C, q)$, we get

$$\frac{\partial W}{\partial C}\bigg|_q = -\frac{q^2}{2C^2} = -\frac{\varphi^2}{2} \quad \text{and so} \quad \frac{\partial W}{\partial C}\bigg|_q = -\frac{\partial W}{\partial C}\bigg|_\varphi$$


Up  to  now  we  have  regarded  the  variables  x and  y as  varying 
independently  of  one  another.  Now  suppose  they  depend  on  a cer- 
tain variable  t,  that  is, 

* = x{t),  y = y{t)  (4) 

These equations specify "parametrically" a certain line in the xy-plane
with t the parameter (see, for example, HM, Sec. 1.8). The parameter t
may have different physical meanings, but it is most convenient
to regard it as the time, that is, to assume that we are considering
the path of a particle moving in the xy-plane. Then z = f(x, y) =
= f(x(t), y(t)), so that in reality z is a function of the single
variable t. However, it is specified in a complicated fashion. Let us
find its derivative dz/dt (the total derivative along the curve (4)).*
Since from (2) and (4) we get

$$dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy = \frac{\partial z}{\partial x}\frac{dx}{dt}\,dt + \frac{\partial z}{\partial y}\frac{dy}{dt}\,dt = \left(\frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}\right)dt$$

it follows that

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}$$

* The reader is advised to review the derivation of the formula for the
derivative of a composite function of one variable (see HM, Sec. 3.3).

In particular, if z = f(x, y), where y = y(x), then

$$\frac{dz}{dx} = \frac{\partial z}{\partial x} + \frac{\partial z}{\partial y}\frac{dy}{dx}$$

We see that the value dz/dx of the total derivative along the curve
for given x, y depends not only on the type of function f but also
on dy/dx, that is to say, on the slope of the curve to the x-axis
at the given point.

Both  these  formulas  are  easy  to  understand  with  the  aid  of 
Fig.  23.  If  during  time  dt  the  point  moved  from  M to  P along  the 
dashed  line,  the  rate  of  change  of  the  function  is 


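$$\frac{df}{dt} = \frac{\dfrac{\partial f}{\partial x}\,dx + \dfrac{\partial f}{\partial y}\,dy}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}$$

To obtain the second formula, it is necessary to put df/dx instead
of df/dt.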

To illustrate, let us examine a case where a certain quantity u,
say the pressure or temperature of a gas flow, is determined at every
instant of time t at every point (x, y, z) of space. Here u is a function
of four variables, x, y, z, t: u = u(x, y, z, t). Furthermore, suppose
we have a law of motion x = x(t), y = y(t), z = z(t) of a particle M.
If we examine the value of u in M in the course of the motion,
then this value will be a composite function of the time:

u = u(x(t), y(t), z(t), t)

The rate of change of this value of u, that is, the rate of change
of u along the "path", is equal, by virtue of the formula for the
derivative of a composite function, to

$$\frac{du}{dt} = \frac{\partial u}{\partial x}\frac{dx}{dt} + \frac{\partial u}{\partial y}\frac{dy}{dt} + \frac{\partial u}{\partial z}\frac{dz}{dt} + \frac{\partial u}{\partial t} \tag{5}$$

If u does not depend explicitly on t (we then say that "the field
u is stationary"), then $\partial u/\partial t = 0$, and we have only
three terms on the right. Thus they yield the rate of change of u
obtained solely from the movement of point M along the path from one
value of u to another (for example, if u is the temperature, then it
will be due to the motion from a cooler portion of space to a hotter
portion, and the like). This is termed the convective rate. The last
summand gives the rate of change of the field at a fixed point resulting
from the nonstationarity of the field; this is the local rate. In the
general case, both factors operate, and the rate of change of the field
along the path is compounded of the convective and the local rates of
change of the field.
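The splitting of formula (5) into a convective and a local part can be illustrated numerically. The sketch below is only an illustration: the field u(x, y, z, t) = xy + zt² and the law of motion x = cos t, y = sin t, z = t are hypothetical choices, and the partial derivatives are estimated by central differences.

```python
import math


def u(x, y, z, t):
    # Hypothetical nonstationary field, chosen only for illustration.
    return x * y + z * t**2


def path(t):
    # Hypothetical law of motion x(t), y(t), z(t) of the particle M.
    return math.cos(t), math.sin(t), t


def velocity(t):
    # Derivatives dx/dt, dy/dt, dz/dt of the path above.
    return -math.sin(t), math.cos(t), 1.0


def partial(func, args, index, h=1e-5):
    # Central-difference partial derivative with respect to args[index].
    upper = list(args)
    lower = list(args)
    upper[index] += h
    lower[index] -= h
    return (func(*upper) - func(*lower)) / (2 * h)


def convective_and_local_rates(t):
    x, y, z = path(t)
    vx, vy, vz = velocity(t)
    point = (x, y, z, t)
    # First three terms of (5): change due to motion through the field.
    convective = (partial(u, point, 0) * vx
                  + partial(u, point, 1) * vy
                  + partial(u, point, 2) * vz)
    # Last term of (5): change of the field at a fixed point.
    local = partial(u, point, 3)
    return convective, local


def rate_along_path(t, h=1e-5):
    # Direct derivative of the composite function u(x(t), y(t), z(t), t).
    return (u(*path(t + h), t + h) - u(*path(t - h), t - h)) / (2 * h)


t0 = 0.7
convective, local = convective_and_local_rates(t0)
total = rate_along_path(t0)
print(convective + local, total)
```

The sum of the two parts agrees with the directly computed rate along the path to within the accuracy of the finite differences.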

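Let us return to the case of the function z = f(x, y) of two
independent variables. It is clear that the partial derivatives
$\partial z/\partial x$ and $\partial z/\partial y$ of this function
depend on x and y themselves. We can therefore find their partial
derivatives. These are called partial derivatives of the second order,
or second partial derivatives; their partial derivatives are partial
derivatives of the third order, or third partial derivatives, and so on.
We can form the following second partial derivatives:

$$\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\right), \quad \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right), \quad \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right), \quad \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial y}\right)$$

or, in different notation,

$$z''_{xx} = \frac{\partial^2 z}{\partial x^2}, \quad z''_{xy} = \frac{\partial^2 z}{\partial x\,\partial y}, \quad z''_{yx} = \frac{\partial^2 z}{\partial y\,\partial x}, \quad z''_{yy} = \frac{\partial^2 z}{\partial y^2}$$

The derivatives $\partial^2 z/\partial x\,\partial y$ and
$\partial^2 z/\partial y\,\partial x$ are called mixed derivatives.

For example, in (3)

$$\frac{\partial^2 z}{\partial x^2} = \frac{\partial}{\partial x}(2xy^3 + e^y) = 2y^3, \qquad \frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial}{\partial y}(2xy^3 + e^y) = 6xy^2 + e^y,$$

$$\frac{\partial^2 z}{\partial y\,\partial x} = \frac{\partial}{\partial x}(3x^2y^2 + xe^y) = 6xy^2 + e^y, \qquad \frac{\partial^2 z}{\partial y^2} = \frac{\partial}{\partial y}(3x^2y^2 + xe^y) = 6x^2y + xe^y$$

We see that in this example
$\partial^2 z/\partial x\,\partial y = \partial^2 z/\partial y\,\partial x$,
that is, the mixed derivatives do not depend on the order in which the
differentiation is performed.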

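We will show that mixed derivatives are equal in the general
case as well. Note that by the definition of a derivative (see Sec. 2.2)

$$\left.\frac{\partial^2 z}{\partial x\,\partial y}\right|_{x=x_0,\;y=y_0} = \left.\frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right)\right|_{x=x_0,\;y=y_0} = \frac{\left.\dfrac{\partial z}{\partial x}\right|_{x=x_0,\;y=y_0+k} - \left.\dfrac{\partial z}{\partial x}\right|_{x=x_0,\;y=y_0-k}}{2k}$$

(to any degree of accuracy) provided that k is sufficiently small.*

* To be precise, the derivative is the limit to which the right member
tends as k tends to zero. Similar formulations apply to the formulas
that follow.

In the same way,

$$\left.\frac{\partial z}{\partial x}\right|_{x=x_0,\;y=y_0+k} = \frac{f(x_0+h,\,y_0+k) - f(x_0-h,\,y_0+k)}{2h},$$

$$\left.\frac{\partial z}{\partial x}\right|_{x=x_0,\;y=y_0-k} = \frac{f(x_0+h,\,y_0-k) - f(x_0-h,\,y_0-k)}{2h}$$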


Substituting these expressions into the formula for the mixed
derivative, we get

$$\left.\frac{\partial^2 z}{\partial x\,\partial y}\right|_{x=x_0,\;y=y_0} = \frac{f(x_0+h,\,y_0+k) - f(x_0-h,\,y_0+k)}{4hk} - \frac{f(x_0+h,\,y_0-k) - f(x_0-h,\,y_0-k)}{4hk} \tag{6}$$


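In the same manner we now obtain

$$\left.\frac{\partial^2 z}{\partial y\,\partial x}\right|_{x=x_0,\;y=y_0} = \left.\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right)\right|_{x=x_0,\;y=y_0} = \frac{\left.\dfrac{\partial z}{\partial y}\right|_{x=x_0+h,\;y=y_0} - \left.\dfrac{\partial z}{\partial y}\right|_{x=x_0-h,\;y=y_0}}{2h}$$

where

$$\left.\frac{\partial z}{\partial y}\right|_{x=x_0+h,\;y=y_0} = \frac{f(x_0+h,\,y_0+k) - f(x_0+h,\,y_0-k)}{2k},$$

$$\left.\frac{\partial z}{\partial y}\right|_{x=x_0-h,\;y=y_0} = \frac{f(x_0-h,\,y_0+k) - f(x_0-h,\,y_0-k)}{2k}$$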


Finally we get

$$\left.\frac{\partial^2 z}{\partial y\,\partial x}\right|_{x=x_0,\;y=y_0} = \frac{f(x_0+h,\,y_0+k) - f(x_0+h,\,y_0-k)}{4hk} - \frac{f(x_0-h,\,y_0+k) - f(x_0-h,\,y_0-k)}{4hk} \tag{7}$$


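Comparing (6) and (7), we see that

$$\left.\frac{\partial^2 z}{\partial y\,\partial x}\right|_{x=x_0,\;y=y_0} = \left.\frac{\partial^2 z}{\partial x\,\partial y}\right|_{x=x_0,\;y=y_0}$$

and since the point $(x_0, y_0)$ is quite arbitrary, these derivatives
are equal for all values of x and y.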

Similarly, for derivatives of any order of functions of any number
of variables, the essential thing is the number of differentiations
to be performed with respect to a given variable and not the order
in which they are performed.

To get a better idea of the meaning of a mixed derivative, locate
in the plane of independent variables those four points associated
with the values of the function that enter into the right members
of (6) and (7). Compare (6) and (7) with similar expressions for
$\partial^2 z/\partial x^2$ and $\partial^2 z/\partial y^2$.
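The four-point expression on the right of (6) and (7) can be tried out directly. The following Python sketch is an illustration under stated assumptions: it uses z = x²y³ + xe^y, the function whose partial derivatives are worked out above, and the evaluation point and step sizes are arbitrary choices.

```python
import math


def f(x, y):
    # Function whose first partials are 2xy^3 + e^y and 3x^2y^2 + xe^y.
    return x**2 * y**3 + x * math.exp(y)


def mixed_four_point(func, x0, y0, h, k):
    # Right member of (6); regrouping the same four terms gives (7).
    return (func(x0 + h, y0 + k) - func(x0 - h, y0 + k)
            - func(x0 + h, y0 - k) + func(x0 - h, y0 - k)) / (4 * h * k)


x0, y0 = 1.0, 2.0                      # arbitrary sample point (assumed)
approx = mixed_four_point(f, x0, y0, h=1e-3, k=1e-3)
exact = 6 * x0 * y0**2 + math.exp(y0)  # mixed derivative found analytically
print(approx, exact)
```

Because the four function values enter (6) and (7) with the same signs, a single routine serves for both orders of differentiation.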

Exercises 

1. $z = x^2 + y^2$. Find $\partial z/\partial x$ for x = 1, y = 1, and
$\partial z/\partial y$ for x = 2, y = 0.5.

Find the partial derivatives of the following functions:

2. $z = e^{-(x^2+y^2)}$. 3. $z = xe^y + ye^x$. 4. $z = x \sin y$.
5. $z = \sin(xy)$.

6. $z = \sqrt{x^2 + y^2}$.

Find  the  total  derivatives  of  the  following  functions: 

7.  z — x2  — y2,  x = t + y > y = t + |/7. 

8. $z = e^{x-y}$, $x = \sin t$, $y = t^2$.

9.  z — x3  3xy2,  x = t2,  y — e 

10.  Find  — if  z = ln(^  + <?),  y = 

dx 

11. Verify the equation

$$\frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial^2 z}{\partial y\,\partial x}$$

for all functions of Exercises 2 to 6.

4.2  Geometrical  meaning  of  a function  of  two  variables 

A function  of  two  variables,  z = f(x,  y ),  is  conveniently  visual- 
ized geometrically  as  an  equation  of  a surface,  z = f(x,  y),  where 
z is  the  altitude  and  x and  y are  coordinates  of  a point  in  the  hori- 
zontal plane  (Fig.  24).  Since  it  is  difficult  to  depict  a surface  on  a 
flat  drawing,  we  will  represent  a function  of  two  variables  graphi- 
cally by  constructing  a set  of  curves  representing  sections  of  the 
surface z = f(x, y) made by planes parallel to the xz-plane. These
planes are perpendicular to the y-axis and the y-coordinate is
constant in each plane. The intersection of a given plane with the
surface z = f(x, y) yields a curve, z = f(x, y = constant). To make
this pictorial, plot several such curves in one figure in coordinates x
and z. We thus obtain a family of curves. Fig. 25 shows a number
of such curves of a family for the hemisphere $z = \sqrt{16 - x^2 - y^2}$.
Each curve is labelled with the value, $y = y_n$, that corresponds to the
curve. If we have a family of curves corresponding to y = constant
in the xz-plane, then the derivative $\partial z/\partial x$ signifies
geometrically (exactly like dz/dx does in the case of one variable) the
slope of the tangent line to the x-axis. The derivative
$\partial z/\partial y$ may be found by forming the ratio

$$\frac{z_{n+1}(x) - z_n(x)}{y_{n+1} - y_n};$$

here, two adjacent curves are needed.

Naturally, the function z = f(x, y) can be represented graphically
if we construct the graphs of z(y) for x = constant in the yz-plane.

A fundamentally different way of describing a function z = f(x, y)
is obtained if we construct a section of the surface z = f(x, y)
by horizontal planes z = constant and then plot the resulting curves
(so-called level lines) on the xy-plane. This is the method used for
indicating relief on maps. Fig. 26 illustrates such a representation
for the function $z = x^2 - y^2$. Each curve corresponds to a definite
constant value of z.

How does one find derivatives on a graph of this kind? A portion
of the graph is shown enlarged in Fig. 27. It is clear that if the level
lines are drawn close enough together, then, to any degree of accuracy,

$$\frac{\partial z}{\partial x} = \frac{z_{n+1} - z_n}{x_b - x_a}, \qquad \frac{\partial z}{\partial y} = \frac{z_{n+1} - z_n}{y_c - y_a}$$

The closer together the lines with distinct values of z, the
greater are the partial derivatives. (It is assumed that these lines
are drawn for equal intervals with respect to z.)
Exercises

Construct a family of curves corresponding to y = constant
for the following functions:

1. $z = xy$. 2. $z = x^2 - y^2$. 3. $z = \sqrt{x^2 + y^2}$.

4. Construct a family of curves corresponding to z = constant for
the function $z = x^2 - y^2$.

4.3  Implicit  functions 

It is often useful to apply the concepts of the theory of functions
of several variables when investigating functions of one variable.
(This was illustrated in HM, Sec. 3.12, but here we will discuss
it in more detail.) Such a function is usually specified by means of
an "explicit" formula, for example, $y = ax^2$ or $y = be^{ax}$ and the
like, in which it is directly indicated how to evaluate y for a given
value of x. Such formulas are best suited for performing mathematical
operations.

An alternative method of defining y as a function of x is called
defining the function implicitly, thus:

$$f(x, y) = 0 \tag{8}$$

For example,

$$c^3 + y^3 + 3axy = 0 \tag{9}$$

or

$$(x + y)^3 - b^2(x - y) = 0 \tag{10}$$

and so on.

In this case, to find y for a given x it is necessary to solve
(8) for y. The solution of the equation is often in a much more
complicated form than formula (8). Take (9) for instance:

$$y = \sqrt[3]{-\frac{c^3}{2} + \sqrt{\frac{c^6}{4} + a^3x^3}} + \sqrt[3]{-\frac{c^3}{2} - \sqrt{\frac{c^6}{4} + a^3x^3}}$$

It often happens that the solution cannot even be written as a
formula and the equation (8) can only be solved numerically.

Still, there are cases where it is sufficient to resort to a function
of only one variable in order to investigate the functional
relationship at hand. Such is the case when the equation is solved
or is readily solvable for x, that is, when it can be written as

$$x = \varphi(y) \tag{11}$$

and also when the equation is solvable parametrically (this will be
discussed later on). To illustrate, take example (9). Here y is
expressed in terms of x in an extremely complicated fashion but it
is easy to find

$$x = -\frac{y^3 + c^3}{3ay}$$

Then by specifying various values of y, it is easy to find the
corresponding values of x, which can be tabulated and plotted as a
graph in terms of x- and y-coordinates. (For a = 1, c = 1.44, see
Table 4 and the graph in Fig. 28).

Fig. 28

Table 4

    y        x         y        x         y        x
  -4.0     -5.08     -0.5      1.91      1.5     -1.42
  -3.0     -2.66     -0.25     3.98      2.0     -1.83
  -2.0     -0.83      0.25    -4.02      2.5     -2.48
  -1.5     -0.084     0.5     -2.02      3.0     -3.34
  -1.0     -0.67      1.0     -1.33      4.0     -5.58

Having constructed the curve, we can find y from a given x. To
do this (see Fig. 28), draw a vertical line corresponding to the
required x and find the desired value of y.* If we have a table of
values of x for certain values of y, then for a y that corresponds
to a value of x not in the table we can use linear interpolation
(cf. Sec. 2.1).

* From Fig. 28 it is evident that certain values of x are associated
with one value of y, and some values of x are associated with three
values of y, and there is one value of x that corresponds to two values
of y. This is due to the fact that an equation of the third degree can
have one, two or three real distinct roots (if there are two, one of
them is a double root, and it corresponds to tangency of the straight
line x = constant and the curve).

Here is how this is done. Take two pairs of numbers from the
table, $(y_1, x_1)$ and $(y_2, x_2)$, such that $x_1 < x < x_2$. We
assume that when x varies from $x_1$ to $x_2$, the graph of the function
differs only slightly from the straight line y = kx + b. The numbers
k and b are readily determined from the fact that $y = y_1$ when
$x = x_1$ and $y = y_2$ when $x = x_2$. Indeed, $y_1 = kx_1 + b$,
$y_2 = kx_2 + b$, whence

$$k = \frac{y_2 - y_1}{x_2 - x_1}, \qquad b = \frac{x_2y_1 - x_1y_2}{x_2 - x_1}$$

For this reason

$$y = \frac{y_2 - y_1}{x_2 - x_1}\,x + \frac{x_2y_1 - x_1y_2}{x_2 - x_1}$$

The values of y for all x located between $x_1$ and $x_2$ may be
computed from the formula thus obtained. The closer together the
numbers $x_1$ and $x_2$ and also $y_1$ and $y_2$, that is, the closer
together the points are in the table, the more exact is the formula
for y.
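The interpolation formula can be sketched in a few lines of Python. This is only an illustration: the two (y, x) pairs are rows of Table 4, while the function name and the intermediate value x = -1.60 are assumptions made for the example.

```python
def interpolate_y(x, point1, point2):
    """Linear interpolation of y at x between two table rows.

    Each row is a (y, x) pair, in the same order as the columns of Table 4.
    """
    y1, x1 = point1
    y2, x2 = point2
    k = (y2 - y1) / (x2 - x1)              # slope of the straight line
    b = (x2 * y1 - x1 * y2) / (x2 - x1)    # its intercept
    return k * x + b


# Two neighbouring rows of Table 4, ordered so that x1 < x < x2.
row_a = (2.0, -1.83)
row_b = (1.5, -1.42)

y_estimate = interpolate_y(-1.60, row_a, row_b)
print(y_estimate)
```

At the tabulated values of x the routine reproduces the tabulated y exactly, and in between it returns a value lying between the two neighbouring entries.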

How does one find the derivative dy/dx in that case? There is
no need to solve the equation and find y = f(x). If the function is
given in the form $x = \varphi(y)$, then
$dx = \dfrac{d\varphi}{dy}\,dy$, whence

$$\frac{dy}{dx} = \frac{1}{\dfrac{d\varphi}{dy}} = \frac{1}{\varphi'(y)}$$

The only drawback to this expression is that the derivative is
given as a function of y. And so if it is necessary to find dy/dx for a
given x, then again we first have to find the y corresponding to
the given x, and only then substitute the y into the expression

$$\frac{dy}{dx} = \frac{1}{\varphi'(y)}$$

At first glance it would seem that it is easier to determine the
derivative numerically, as in Sec. 2.2, for if we can determine y
for a given x, then we can also find $y(x + \Delta x)$ corresponding to
the quantity $x + \Delta x$, after which we can approximate the
derivative by, say, the formula

$$\frac{y(x + \Delta x) - y(x)}{\Delta x}$$

Actually, this procedure is much worse: we would have to find y
for two distinct values of x, that is, we would have to solve equation
(11) twice. Also, it is necessary to determine y to a high degree of
accuracy because we need the difference of two close-lying values
of y, and a small error in each of them can produce a substantial
relative error in their difference. In the expression for the derivative,
the denominator contains the small quantity $\Delta x$ and therefore even
a slight error in the numerator can lead to an appreciable error in
the derivative. Hence it is better to use the formula

$$\frac{dy}{dx} = \frac{1}{\varphi'(y)}$$

It is well to observe that the formula for the second derivative
is much more complicated. Don't think that

$$\frac{d^2y}{dx^2} = \frac{1}{\dfrac{d^2x}{dy^2}} = \frac{1}{\dfrac{d^2\varphi}{dy^2}}$$

Actually, this formula is incorrect dimensionally: $d^2y/dx^2$ has the
dimensions of $y/x^2$, while $1\big/\dfrac{d^2x}{dy^2}$ has the
dimensions of $y^2/x$. The proper formula is obtained thus:

$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{dy}{dx}\cdot\frac{d}{dy}\left(\frac{dy}{dx}\right) = -\frac{\varphi''(y)}{[\varphi'(y)]^3}$$

It is easy to verify the dimensionality of this expression: it is
the same as that of the ratio

$$\frac{x/y^2}{(x/y)^3} = \frac{y}{x^2},$$

which is as it should be.
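Both formulas for a function given as x = φ(y) can be compared with direct numerical differentiation. In the Python sketch below the function φ(y) = y + y³ is a hypothetical example chosen because it is increasing and easy to differentiate; the bisection solver stands in for the numerical solution of (11).

```python
def phi(y):
    # Hypothetical increasing function x = phi(y), for illustration only.
    return y + y**3


def dphi(y):
    return 1 + 3 * y**2


def d2phi(y):
    return 6 * y


def solve_y(x, lo=-10.0, hi=10.0):
    # Bisection for phi(y) = x; valid because phi is increasing.
    for _ in range(200):
        mid = (lo + hi) / 2
        if phi(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


x0 = 2.0
y0 = solve_y(x0)                          # root of y + y**3 = 2

dy_dx = 1 / dphi(y0)                      # first derivative, 1/phi'(y)
d2y_dx2 = -d2phi(y0) / dphi(y0)**3        # second derivative, -phi''/phi'^3
naive_guess = 1 / d2phi(y0)               # the incorrect "1/phi''" expression

# Finite differences of y(x): each one needs extra solutions of phi(y) = x.
h = 1e-3
y_plus, y_minus = solve_y(x0 + h), solve_y(x0 - h)
dy_dx_numeric = (y_plus - y_minus) / (2 * h)
d2y_dx2_numeric = (y_plus - 2 * y0 + y_minus) / h**2

print(dy_dx, dy_dx_numeric)
print(d2y_dx2, d2y_dx2_numeric, naive_guess)
```

The formula values need only one solution of the equation, whereas the difference quotients need three, and the naive reciprocal of φ'' does not match either estimate of the second derivative.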

When considering the implicit functional relationship (8), we
can arrive at a more general case where both variables x and y are
expressible as a function of one and the same auxiliary variable t:

x = x(t),  y = y(t)

For instance, if in (10) we put

$$x + y = t \tag{12}$$

then from (10) we get $x - y = t^3/b^2$, whence, invoking (12), we find

$$x = \frac{t}{2} + \frac{t^3}{2b^2}, \qquad y = \frac{t}{2} - \frac{t^3}{2b^2}$$
This is the so-called parametric representation of a function
(the variable t is the parameter). This representation can be
obtained without involving relations like (8). It is considered in
elementary calculus textbooks (see, for example, HM, Secs. 1.8, 3.3,
6.14). It is shown there how to find the derivative of such a function:
write dx as $dx = \dfrac{dx}{dt}\,dt = x'(t)\,dt$, and similarly,
$dy = \dfrac{dy}{dt}\,dt = y'(t)\,dt$, whence

$$\frac{dy}{dx} = \frac{y'(t)}{x'(t)}$$

Let us find the second derivative $d^2y/dx^2$. We take advantage of
the relation

$$\frac{dz}{dx} = \frac{dz}{dt}\cdot\frac{dt}{dx} = \frac{dz}{dt}\cdot\frac{1}{\dfrac{dx}{dt}}$$

Therefore

$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d}{dx}\left(\frac{y'(t)}{x'(t)}\right) = \frac{d}{dt}\left(\frac{y'(t)}{x'(t)}\right)\frac{dt}{dx}$$

whence

$$\frac{d^2y}{dx^2} = \frac{1}{x'(t)}\cdot\frac{d}{dt}\left(\frac{y'(t)}{x'(t)}\right) = \frac{y''(t)}{[x'(t)]^2} - \frac{y'(t)\,x''(t)}{[x'(t)]^3}$$

where the primes indicate derivatives with respect to t.
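As a check on the parametric formulas, the following Python sketch evaluates dy/dx and d²y/dx² for the parametrization of (10) obtained with x + y = t and compares them with finite differences. The values b = 1 and t = 1 are arbitrary assumptions, and the helper names are illustrative.

```python
def x_of_t(t, b):
    return t / 2 + t**3 / (2 * b**2)


def y_of_t(t, b):
    return t / 2 - t**3 / (2 * b**2)


def first_and_second_derivative(t, b):
    x1 = 0.5 + 3 * t**2 / (2 * b**2)    # x'(t)
    y1 = 0.5 - 3 * t**2 / (2 * b**2)    # y'(t)
    x2 = 3 * t / b**2                   # x''(t)
    y2 = -3 * t / b**2                  # y''(t)
    dy_dx = y1 / x1
    d2y_dx2 = y2 / x1**2 - y1 * x2 / x1**3
    return dy_dx, d2y_dx2


b, t0 = 1.0, 1.0                        # illustrative values (assumed)
slope, second = first_and_second_derivative(t0, b)

# The parametric point should satisfy relation (10).
x0, y0 = x_of_t(t0, b), y_of_t(t0, b)
residual = (x0 + y0)**3 - b**2 * (x0 - y0)

# Independent finite-difference estimates along the curve.
h = 1e-4
dx = x_of_t(t0 + h, b) - x_of_t(t0 - h, b)
slope_fd = (y_of_t(t0 + h, b) - y_of_t(t0 - h, b)) / dx
second_fd = (first_and_second_derivative(t0 + h, b)[0]
             - first_and_second_derivative(t0 - h, b)[0]) / dx

print(slope, slope_fd, second, second_fd, residual)
```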

In the general case of implicit representation, the function f(x, y)
may be such that the equation (8) is not solvable either in the
form (11) or in parametric form. Let us, for instance, consider y
as a function of x specified by the formula

$$y^5 + xy + x^5 - 7 = 0 \tag{13}$$

(Geometrically, this formula specifies a curve in the xy-plane.)
From this it is impossible to obtain the expression y = f(x). To find
the derivative dy/dx, equate the derivatives of both members of
(13), assuming y to be a function of x, y = y(x), determined from
this equation. With such an interpretation it is an identity and
therefore admits differentiation:

$$5y^4y' + (1\cdot y + x\cdot y') + 5x^4 = 0$$

From this we get

$$y' = \frac{dy}{dx} = -\frac{y + 5x^4}{5y^4 + x} \tag{14}$$

We stress that the x and y on the right are not independent but
are connected by the relation (13), so there is only one independent
variable. To find y' for a given x, solve (13) numerically,
find y for the given x, and substitute this y into (14).
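The procedure just described, solving (13) numerically for y and then substituting into (14), can be sketched as follows. This is a minimal Python illustration: bisection is one possible choice of numerical method, and the bracketing interval [0, 2] and the sample value x = 1 are assumptions made for the example.

```python
def F(x, y):
    # Left member of (13).
    return y**5 + x * y + x**5 - 7


def solve_for_y(x, lo=0.0, hi=2.0, iterations=100):
    # Bisection in y; assumes F(x, lo) < 0 < F(x, hi), which holds near x = 1.
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if F(x, mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def slope_from_formula_14(x, y):
    return -(y + 5 * x**4) / (5 * y**4 + x)


x0 = 1.0
y0 = solve_for_y(x0)                    # one numerical solution of (13)
slope = slope_from_formula_14(x0, y0)   # derivative from (14)

# For comparison only: difference quotient needing two more solutions of (13).
h = 1e-5
slope_fd = (solve_for_y(x0 + h) - solve_for_y(x0 - h)) / (2 * h)

print(y0, slope, slope_fd)
```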

We now take up the general case of an implicit function of
several (say two) variables.

If z is given as a function of x and y, then, when solving the
equation z = f(x, y) for x, we can obtain x as a function of y
and z: $x = \varphi(y, z)$; and when solving the equation z = f(x, y)
for y, we can find $y = \psi(x, z)$. However, irrespective of whether we
have solved the equation z = f(x, y) for x or not, this equation yields
x as a function of y and z. This is called an implicit representation of
the function x.

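The derivatives $\partial x/\partial y$ and $\partial x/\partial z$ may
be found without expressing x explicitly. This is how it is done. From
the relation z = f(x, y) we find
$dz = \dfrac{\partial z}{\partial x}\,dx + \dfrac{\partial z}{\partial y}\,dy$,
whence

$$dx = \frac{1}{\dfrac{\partial z}{\partial x}}\,dz - \frac{\dfrac{\partial z}{\partial y}}{\dfrac{\partial z}{\partial x}}\,dy \tag{15}$$

On the other hand, if $x = \varphi(y, z)$, then

$$dx = \frac{\partial x}{\partial z}\,dz + \frac{\partial x}{\partial y}\,dy \tag{16}$$

Comparing expressions (15) and (16), we get

$$\left.\frac{\partial x}{\partial z}\right|_{y} = \frac{1}{\left.\dfrac{\partial z}{\partial x}\right|_{y}} \tag{17}$$

$$\left.\frac{\partial x}{\partial y}\right|_{z} = -\frac{\left.\dfrac{\partial z}{\partial y}\right|_{x}}{\left.\dfrac{\partial z}{\partial x}\right|_{y}} \tag{18}$$

Formula (17) appears to be quite natural. In (18) we find,
surprisingly, a minus sign. Geometrical reasoning will convince us of
the truth of this formula.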

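Consider Figs. 26 and 27. In these figures are plotted the curves
z = constant, so that $\left.\dfrac{\partial x}{\partial y}\right|_{z}$
is the derivative dx/dy taken in ordinary fashion along these curves.
Therefore

$$\left.\frac{\partial x}{\partial y}\right|_{z} = \frac{x_b - x_c}{y_b - y_c}$$

Noting that $y_b = y_a$, $x_c = x_a$, we get

$$\left.\frac{\partial x}{\partial y}\right|_{z} = \frac{x_b - x_a}{y_a - y_c}$$

Recalling the values of $\partial z/\partial x$ and
$\partial z/\partial y$, we see that

$$\frac{\partial z}{\partial y}\bigg/\frac{\partial z}{\partial x} = \frac{x_b - x_a}{y_c - y_a}$$

which differs from the preceding ratio only in sign, in agreement
with (18).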
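Formula (18) may be written thus:

$$\left.\frac{\partial x}{\partial y}\right|_{z}\cdot\left.\frac{\partial z}{\partial x}\right|_{y} = -\left.\frac{\partial z}{\partial y}\right|_{x}$$

Noting that
$\left.\dfrac{\partial z}{\partial y}\right|_{x} = 1\Big/\left.\dfrac{\partial y}{\partial z}\right|_{x}$,
we find that

$$\left.\frac{\partial x}{\partial y}\right|_{z}\cdot\left.\frac{\partial y}{\partial z}\right|_{x}\cdot\left.\frac{\partial z}{\partial x}\right|_{y} = -1$$

Like (18), the latter formula shows that in contrast to the case
of a composite function of one variable we cannot cancel
$\partial x$, $\partial y$, $\partial z$ in the numerator and
denominator. The point is that in this formula the three partial
derivatives are computed under different conditions (z is held constant
in the first case, x in the second, and y in the third).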
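The product formula is easy to test on a relation that can be solved explicitly for each variable. The Python sketch below uses the hypothetical relation z = x²y (with x, y > 0), which is not from the text, and estimates each partial derivative by a central difference with the appropriate variable held fixed.

```python
import math


def z_of(x, y):
    # Hypothetical relation z = f(x, y), chosen because it is easily inverted.
    return x**2 * y


def x_of(y, z):
    # The same relation solved for x (branch x > 0).
    return math.sqrt(z / y)


def y_of(x, z):
    # The same relation solved for y.
    return z / x**2


def central(func, a, h=1e-6):
    return (func(a + h) - func(a - h)) / (2 * h)


x0, y0 = 1.5, 2.0
z0 = z_of(x0, y0)

dx_dy_const_z = central(lambda y: x_of(y, z0), y0)   # z held constant
dy_dz_const_x = central(lambda z: y_of(x0, z), z0)   # x held constant
dz_dx_const_y = central(lambda x: z_of(x, y0), x0)   # y held constant
dz_dy_const_x = central(lambda y: z_of(x0, y), y0)   # x held constant

product = dx_dy_const_z * dy_dz_const_x * dz_dx_const_y
print(product)
```

The product comes out close to -1 rather than +1, and the first factor agrees with the right member of (18).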

In Sec. 4.1 we saw that the value of the derivative
$\partial W/\partial C$ depends on which argument is regarded as being
constant during the computation. This alone shows that $\partial W$ and
$\partial C$ cannot be dealt with like numbers and cannot be cancelled
with $\partial W$ and $\partial C$ in other formulas without taking into
account the conditions under which the appropriate partial derivatives
are evaluated.

Here is an example to illustrate the calculation of derivatives
of implicit functions.

Let $z = x^2 + px + y^2 + qy + kxy$. Find the derivatives
$\partial x/\partial z$ and $\partial x/\partial y$.

In order to express x in terms of y and z, it is necessary to solve
a quadratic equation. There are no fundamental difficulties here, but
the expression is cumbersome, involving roots. Since
$\dfrac{\partial z}{\partial x} = 2x + p + ky$ and
$\dfrac{\partial z}{\partial y} = 2y + q + kx$, it follows, by (17)
and (18), that

$$\frac{\partial x}{\partial z} = \frac{1}{2x + p + ky}, \qquad \frac{\partial x}{\partial y} = -\frac{2y + q + kx}{2x + p + ky}$$

To determine the values of $\partial x/\partial z$ and
$\partial x/\partial y$ for concretely given y and z, one has to know
the numerical value of x for these y and z, but one does not necessarily
have to have the analytic expression $x = \varphi(y, z)$. It is quite
obviously much simpler to numerically solve the equation z = f(x, y)
for x, given definite values of y and z, than to construct the general
formula $x = \varphi(y, z)$. Besides, in the case of an equation of
higher than fourth degree and also an equation involving transcendental
functions, the construction of a general formula is at times impossible.
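The formulas obtained from (17) and (18) for this example can be checked against finite differences. In the Python sketch below the coefficient values p, q, k and the point (y, z) are hypothetical; the explicit root of the quadratic is used only as an independent check of the derivative formulas.

```python
import math

P, Q, K = 1.0, -2.0, 0.5      # hypothetical values of p, q, k


def z_of(x, y):
    return x**2 + P * x + y**2 + Q * y + K * x * y


def x_of(y, z):
    # Larger root of x**2 + (P + K*y)*x + (y**2 + Q*y - z) = 0.
    half_b = (P + K * y) / 2
    return -half_b + math.sqrt(half_b**2 - (y**2 + Q * y - z))


def dx_dz(x, y):
    # Formula (17) applied to this z.
    return 1 / (2 * x + P + K * y)


def dx_dy(x, y):
    # Formula (18) applied to this z.
    return -(2 * y + Q + K * x) / (2 * x + P + K * y)


y0, z0 = 1.0, 5.0             # illustrative point (assumed)
x0 = x_of(y0, z0)

h = 1e-6
dx_dz_fd = (x_of(y0, z0 + h) - x_of(y0, z0 - h)) / (2 * h)
dx_dy_fd = (x_of(y0 + h, z0) - x_of(y0 - h, z0)) / (2 * h)

print(dx_dz(x0, y0), dx_dz_fd)
print(dx_dy(x0, y0), dx_dy_fd)
```

Only the numerical value of x at the given y and z enters the two derivative formulas, which is the point made in the text.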

The most general method for implicitly representing a function
of two variables is by specifying it via a relation of the form
F(x, y, z) = 0. We leave it to the reader to differentiate this
relation, obtain expressions for the partial derivatives
$\partial y/\partial x$, $\partial z/\partial x$, and so on and apply
the result to the relation considered above, z = f(x, y), by rewriting
it as z - f(x, y) = 0.

The foregoing devices are conveniently used in the case of a
function of one variable when it is defined implicitly with the aid
of an equation of type (8). To do this, consider a function of two
variables, z = f(x, y), in other words, consider the relation (8) as the
equation of the zero level line (that is, one corresponding to the
value f = 0) of the function z = f(x, y). Then the problem of
evaluating dy/dx reduces to the foregoing problem of computing
$\left.\dfrac{\partial y}{\partial x}\right|_{z=\text{const}}$.

To illustrate, in (13) we have to put

$$z = y^5 + xy + x^5$$

Then by formula (18)

$$\frac{dy}{dx} = \left.\frac{\partial y}{\partial x}\right|_{z} = -\frac{\left.\dfrac{\partial z}{\partial x}\right|_{y}}{\left.\dfrac{\partial z}{\partial y}\right|_{x}} = -\frac{y + 5x^4}{5y^4 + x}$$

which is to say we again arrive at (14).

Exercises 

1. Find $\dfrac{dy}{dx}$ and $\dfrac{d^2y}{dx^2}$ for the following
functions given parametrically:

(a)  x = y = /2  + t;  (b)  x = 2 sin3/,  y = 2 cos3/  ; 

(c) $x = \cos t + t \sin t$, $y = \sin t - t \cos t$.


2. $x = \sin t$, $y = \cos 2t$. Find $\dfrac{dy}{dx}$ for
$x = \dfrac{1}{2}$.

3. 

2 = x3  + y3  + xy.  Find  — i 

. — for  y = 0,  z 

dz 

dy 

4. 

z = x5  + xy2  + y5.  Find  — 

dx 

* — • 

dz 

dy 

5. $x^2 + y^2 - 4x - 10y = -4$. Find $\dfrac{dy}{dx}$.

6. $x^4y + xy^4 - x^2y^2 - 1 = 0$. Find $\dfrac{dy}{dx}$.
dx 
An interesting example of a function of two variables is the
current j passing through the anode (plate) of a three-electrode
(grid-controlled) electron tube (Fig. 29).

The flow of electrons to the anode is influenced by the grid
potential $u_G$ and the potential $u_A$ of the anode (the cathode is
grounded so that it has potential zero). We thus have to do with a
function of two variables $j = j(u_G, u_A)$. Ordinarily the graph is in
the form of a family of curves in terms of the coordinates $u_G$, j with
constant values of $u_A$ on every curve. Fig. 30 illustrates such a
family for the Soviet tube 6C1Zh. The current j is given in
milliamperes, the voltages $u_G$ and $u_A$ in volts. The derivative
$\left.\dfrac{\partial j}{\partial u_G}\right|_{u_A}$ is proportional
to the slope of the tangent lines to the curves in Fig. 30.
Over rather considerable ranges of $u_G$, these curves differ only
slightly from straight lines. Their slope over this interval is termed
the transconductance of the tube and is denoted by S (other symbols
are also used, for example $g_m$), so that
$S = \left.\dfrac{\partial j}{\partial u_G}\right|_{u_A}$; S is
expressed in milliamperes per volt.

We form $\left.\dfrac{\partial j}{\partial u_A}\right|_{u_G}$. The
reciprocal quantity,

$$R = \frac{1}{\left.\dfrac{\partial j}{\partial u_A}\right|_{u_G}} = \left.\frac{\partial u_A}{\partial j}\right|_{u_G}$$

is called the anode resistance. The meaning of this term stems from
the fact that if the tube in our circuit were replaced by a fixed
resistor $R_1$, then by Ohm's law $j = \dfrac{u_A}{R_1}$, whence

$$\frac{dj}{du_A} = \frac{1}{R_1} \quad \text{or} \quad R_1 = \frac{du_A}{dj}$$

When the current is expressed in milliamperes and the potential in
volts, then R is obtained in kilohms. Finally, a very important
characteristic of an electron tube is the quantity

$$\mu = -\left.\frac{\partial u_A}{\partial u_G}\right|_{j}$$


This  is  a 
Let us find the relation between the variation of grid potential
and that of the anode potential. Regarding $u_A$ as a function of
the variables $u_G$ and j, we find

$$du_A = \left.\frac{\partial u_A}{\partial j}\right|_{u_G} dj + \left.\frac{\partial u_A}{\partial u_G}\right|_{j} du_G, \quad \text{whence} \quad du_A = R\,dj - \mu\,du_G$$

If the anode current is kept constant, then dj = 0, and, hence,
$du_A = -\mu\,du_G$. Thus, the change in anode potential is $\mu$ times
that of the grid potential. We can therefore say that the tube
amplifies the grid potential $\mu$ times, whence the name amplification
factor for $\mu$.

The relation $\dfrac{\partial u_A}{\partial u_G} = -\mu$ refers to the
ideal case when j remains constant. Actually, however, a change in
$u_G$ ordinarily causes a change in the current, and the change in
$u_A$ is less than in the case of j = constant. Consider the circuit
shown in Fig. 29 where we have a resistance r in the anode circuit and
a constant voltage $u_0$. Then by Ohm's law

$$j = \frac{u_0 - u_A}{r} = j(u_G, u_A)$$

where the right-hand member is a function that we examined at
the beginning of this section. Let us take the total differential of
the right and left sides of the equation, assuming $u_0$ and r to be
constant:

$$-\frac{1}{r}\,du_A = \frac{\partial j}{\partial u_A}\,du_A + \frac{\partial j}{\partial u_G}\,du_G$$

or, since $\partial j/\partial u_A = 1/R$,
$\partial j/\partial u_G = S$ and, by (18), $\mu = SR$,

$$\frac{du_A}{du_G} = -\frac{S}{\dfrac{1}{r} + \dfrac{1}{R}} = -\mu\,\frac{r}{r + R}$$

From this formula it is clear that in the circuit of Fig. 29 the
absolute value of the ratio $\dfrac{du_A}{du_G}$ is less than $\mu$
since $\dfrac{r}{r + R} < 1$. If $r \gg R$, then the current j is
almost constant and $\dfrac{du_A}{du_G}$ is very close to $-\mu$.
But a high voltage $u_0$ is needed to drive the given current through
the large resistance r, with a substantial portion of the electric
power going to heat up the resistor r. It turns out that in all cases,
except for amplifying a constant or slowly varying voltage, an
inductor (coil) is better than a resistor.
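The effect of the anode-circuit resistance r on the amplification can be tabulated with a short Python sketch. It assumes the linearized relation -(1/r) du_A = (1/R) du_A + S du_G together with μ = SR, which follows from (18); the numerical values of S and R are hypothetical and do not describe any particular tube.

```python
S = 2.0       # transconductance in mA/V (hypothetical value)
R = 10.0      # anode resistance in kilohms (hypothetical value)
mu = S * R    # amplification factor, dimensionless


def gain(r):
    # Solve -(1/r) du_A = (1/R) du_A + S du_G for the ratio du_A/du_G.
    return -S / (1 / r + 1 / R)


def gain_closed_form(r):
    # The same ratio written as -mu * r / (r + R).
    return -mu * r / (r + R)


for r in (1.0, 10.0, 100.0, 1000.0):   # load resistances in kilohms
    print(r, gain(r), gain_closed_form(r))
```

As r grows relative to R, the printed ratio approaches -μ without reaching it.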
The facts presented in this section show that a precise mathematical
statement of the laws governing the operation of an electron
tube is connected with functions of two variables and the main
quantities describing a tube are partial derivatives.

Exercise

Use Fig. 30 to find the values of S, R and $\mu$ for a 6C1Zh
radio tube.

4.5  Envelope  of  a family  of  curves 


Another  illustration  of  the  use  of  partial  derivatives  is  the  pro- 
blem of  finding  the  envelope  of  a one-parameter  family  of  curves. 
Let  us  examine  this  problem. 

Consider a collection of flight paths (trajectories) of shells fired
from one point (the coordinate origin) with one and the same initial
velocity $v_0$ but at different angles $\varphi$ ranging from 0° to 180°
(Fig. 31).

Each trajectory is a curve in the xy-plane, that is, it is described
by a specific relationship y(x). Writing down the relations x = x(t)
and y = y(t), disregarding air resistance and eliminating t, we easily
find (see, for example, HM, Sec. 6.14) y as a function of x:

$$y = x\tan\varphi - \frac{g}{2v_0^2\cos^2\varphi}\,x^2 \tag{19}$$


There is a definite curve for each concrete value of $\varphi$.
Regarding a variety of values of the parameter $\varphi$, we obtain a
family of curves. We can take it that the height y of the trajectory of
a shell is a function of two variables: the horizontal distance x and
the angle of departure $\varphi$, or $y = y(x, \varphi)$. Then separate
trajectories yield y as a function of x for constant $\varphi$. (The
family of trajectories is depicted in Fig. 31.) Considering this family
of trajectories, it is easy to see that they fill one portion of the
xy-plane and do not appear at all in the other portion. Thus, for the
given initial velocity there is a safe zone in which no shells fall,
no matter what the angle of departure.

Let us try to define the boundary of the effective (killing) zone.
(In Fig. 31 the boundary is shown dashed.) Each point of the
boundary is also a point of one of the trajectories, otherwise the
boundary would not be reached. It will be seen that for the boundary
shown in Fig. 31, each point is a point of a trajectory corresponding
to an angle of departure $\varphi \ge 45°$. (It can be demonstrated that
for $\varphi = 45°$ we attain the greatest possible firing range.
Trajectories corresponding to $\varphi < 45°$ are tangent, as will be
seen below, to the boundary of the effective zone below the x-axis.
This portion of the boundary is of practical interest if the fire is
aimed at targets located below the gun level.) At the same time, the
trajectory cannot cross the boundary but must touch it. If a trajectory
crossed the boundary it would go outside the limits of the zone,
but this runs counter to the fact that the boundary separates the
effective zone from the safe zone.

Fig. 31

The boundary of the region filled with the family of curves
touching this boundary is called the envelope of the family of curves.
Let us find the equation of the envelope of the family of trajectories.
To do this, draw a vertical line $AA_1$ and locate the point B at which
the vertical line cuts the envelope. We have taken a definite value
of x by drawing this vertical line. The point B corresponds to the
greatest height y at which a shell can be in covering the horizontal
distance x for any angle of departure $\varphi$. And so we have to find
the maximum of $y(\varphi)$ for a given fixed x. We get the condition

$$\frac{\partial y(x, \varphi)}{\partial \varphi} = 0 \tag{20}$$

Condition (20) gives us an equation that connects x and $\varphi$.
For every value of x there is a value of $\varphi$ defined by equation
(20) so that we get $\varphi = \varphi(x)$. Substituting
$\varphi = \varphi(x)$ into the equation of the family (19), we find
the equation of the envelope.

Let us carry out the necessary manipulations. For the sake
of convenience we introduce the variable $\theta = \tan\varphi$ in place of the
variable $\varphi$. Noting that by a familiar formula of trigonometry
$\frac{1}{\cos^2\varphi} = \tan^2\varphi + 1 = \theta^2 + 1$, we rewrite the equation (19) as

$$y = \theta x - \frac{g}{2v_0^2}(\theta^2 + 1)\,x^2$$

Finally, introducing the notation $\frac{v_0^2}{g} = l$ (as will be seen from (22),
$l$ is the maximum horizontal range of fire), we get

$$y = \theta x - \frac{x^2}{2l}(\theta^2 + 1) \tag{21}$$

whence

$$\frac{\partial y}{\partial \theta} = x - \frac{x^2}{l}\,\theta$$

The condition (20)′ gives $\theta = \frac{l}{x}$. Substituting this value into (21),
we get the equation of the envelope:

$$y = \frac{l}{2} - \frac{x^2}{2l} \tag{22}$$

We advise the reader to carry through the entire derivation without
passing to the variable $\theta$.
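The envelope can also be checked numerically. The following Python sketch rests on our own assumptions: the family of trajectories is taken in the usual form $y = x\tan\varphi - g x^2/(2 v_0^2 \cos^2\varphi)$ with no air resistance, the envelope is $y = l/2 - x^2/(2l)$ with $l = v_0^2/g$, and $g = 9.8$ m/s². The function names and the sample numbers are hypothetical. For a fixed $x$ the sketch compares the envelope with the greatest height found by scanning a fine grid of angles of departure.

```python
import math

G = 9.8  # gravitational acceleration in m/s^2 (assumed value)

def trajectory_height(x, phi, v0, g=G):
    """Height y(x, phi) of a shell fired at angle phi (radians), no air resistance."""
    return x * math.tan(phi) - g * x**2 / (2 * v0**2 * math.cos(phi)**2)

def envelope_height(x, v0, g=G):
    """Boundary of the effective zone: y = l/2 - x^2/(2l), with l = v0^2/g."""
    l = v0**2 / g
    return l / 2 - x**2 / (2 * l)

def can_hit(x, y, v0, g=G):
    """A target (x, y) is reachable if it lies on or below the envelope."""
    return y <= envelope_height(x, v0, g)

def max_height_by_scan(x, v0, steps=20000, g=G):
    """Brute-force check: largest y(x, phi) over a fine grid of angles in (0, 90 deg)."""
    angles = (math.pi / 2 * k / steps for k in range(1, steps))
    return max(trajectory_height(x, phi, v0, g) for phi in angles)

if __name__ == "__main__":
    v0, x = 50.0, 200.0  # hypothetical numbers for illustration
    print(envelope_height(x, v0))      # height of the envelope above x
    print(max_height_by_scan(x, v0))   # should agree closely with the line above
    print(can_hit(x, 40.0, v0), can_hit(x, 60.0, v0))
```

With these sample values the envelope lies at roughly 49 m above $x = 200$ m, so a target at 40 m is inside the effective zone and one at 60 m is in the safe zone.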

Exercise 

A shell leaves a gun with the velocity $v_0 = 100$ m/s. Can we
hit a target at a horizontal distance of 500 metres from
the gun and at a height of 300 metres? At a height of 500 metres?

4.6  Taylor’s  series  and  extremum  problems 

Taylor's  power  series,  which  is  well  known  for  functions  of  one 
variable,  can  also  be  useful  for  functions  of  many  variables.  Recall 
(see,  for  example,  HM,  Sec.  3.17)  that  for  functions  of  one  variable 
this  series  is  of  the  form 

$$f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \dots \tag{23}$$

Now let us consider a function $f(x, y)$ of two variables. Let $x$
be close to a constant value $a$, and $y$ to a constant value $b$. Then
the crudest approximation, the formula of the zero-order approximation
that does not take into account changes in $x$ and $y$, is of the
form

$$f(x, y) = f(a, b)$$

A more  exact  first-order  approximation  that  takes  into  account 
first-order  terms  is 

$$f(x, y) = f(a, b) + A(x - a) + B(y - b) \tag{24}$$

where $A$ and $B$ are certain numerical coefficients. To find them,
we take the partial derivatives of both sides of (24) with respect
to $x$:

$$f'_x(x, y) = A$$

(the first and third terms in the right-hand member of (24) do not
depend on $x$, and so their partial derivatives with respect to $x$ are
equal to zero). Since the coefficient $A$ must be a constant, we get
$A = f'_x(a, b)$. In similar fashion, differentiating (24) with respect to
$y$, we find $B = f'_y(a, b)$, that is to say, (24) is indeed of the form

$$f(x, y) = f(a, b) + f'_x(a, b)(x - a) + f'_y(a, b)(y - b) \tag{25}$$

(Derive this formula in straightforward fashion from the results
of Sec. 4.1.)

[Footnote to condition (20)′ of the preceding section: In making the
substitution $\varphi = \varphi(\theta)$ we get
$\frac{\partial y}{\partial \theta} = \frac{\partial y}{\partial \varphi} \cdot \frac{d\varphi}{d\theta}$, so that $\frac{\partial y}{\partial \theta} \neq \frac{\partial y}{\partial \varphi}$.
However, the condition $\frac{\partial y}{\partial \theta} = 0$ coincides with the condition $\frac{\partial y}{\partial \varphi} = 0$.]

The more exact formula of the "second approximation", which
takes into account second-order terms as well, is of the form

$$f(x, y) = f(a, b) + [A(x - a) + B(y - b)] + [C(x - a)^2 + D(x - a)(y - b) + E(y - b)^2] \tag{26}$$

where $A$, $B$, $C$, $D$, $E$ are numerical coefficients. To find them, we
compute the first- and second-order partial derivatives:

$$f'_x(x, y) = A + 2C(x - a) + D(y - b),$$
$$f'_y(x, y) = B + D(x - a) + 2E(y - b),$$
$$f''_{xx}(x, y) = 2C, \quad f''_{xy}(x, y) = f''_{yx}(x, y) = D, \quad f''_{yy}(x, y) = 2E$$

Putting $x = a$, $y = b$ in these equations, we get

$$A = f'_x(a, b), \quad B = f'_y(a, b), \quad C = \tfrac{1}{2} f''_{xx}(a, b), \quad D = f''_{xy}(a, b), \quad E = \tfrac{1}{2} f''_{yy}(a, b)$$

Substituting these values into (26), we obtain the final formula for
the second approximation:

$$f(x, y) = f(a, b) + [f'_x(a, b)(x - a) + f'_y(a, b)(y - b)] + \frac{1}{2}[f''_{xx}(a, b)(x - a)^2 + 2f''_{xy}(a, b)(x - a)(y - b) + f''_{yy}(a, b)(y - b)^2] \tag{27}$$

Although we derived this formula independently of (25), we see
that the right side has the same linear part as in (25) and, besides,
second-order corrections have appeared. We leave it to the reader
to derive, in similar fashion, the formula for the third approximation
(incidentally, this formula is rarely used); third-order corrections
will be added to the right side of (27).

(As a check we will assume that the function $f$ does not depend
on $y$; then in the right members of (25) and (27) all the partial
derivatives in which differentiation with respect to $y$ occurs will
drop out and we will get the partial sums of the series (23) for a
function of one variable.)
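The quality of the zero-, first- and second-order approximations can be compared numerically. The Python sketch below is only an illustration under our own assumptions: the partial derivatives at $(a, b)$ are estimated by central differences rather than computed analytically, and the test function $e^x \sin y$, the point $(0, 1)$ and all function names are hypothetical choices.

```python
import math

def partials(f, a, b, h=1e-4):
    """Central-difference estimates of the first and second partial derivatives at (a, b)."""
    fx = (f(a + h, b) - f(a - h, b)) / (2 * h)
    fy = (f(a, b + h) - f(a, b - h)) / (2 * h)
    fxx = (f(a + h, b) - 2 * f(a, b) + f(a - h, b)) / h**2
    fyy = (f(a, b + h) - 2 * f(a, b) + f(a, b - h)) / h**2
    fxy = (f(a + h, b + h) - f(a + h, b - h)
           - f(a - h, b + h) + f(a - h, b - h)) / (4 * h**2)
    return fx, fy, fxx, fxy, fyy

def taylor_approximations(f, a, b, x, y):
    """Return the zero-, first- and second-order approximations of f(x, y) about (a, b)."""
    fx, fy, fxx, fxy, fyy = partials(f, a, b)
    dx, dy = x - a, y - b
    zero = f(a, b)
    first = zero + fx * dx + fy * dy                  # linear formula, as in (25)
    second = first + 0.5 * (fxx * dx**2 + 2 * fxy * dx * dy + fyy * dy**2)  # as in (27)
    return zero, first, second

def example(x, y):
    """Hypothetical smooth test function."""
    return math.exp(x) * math.sin(y)

if __name__ == "__main__":
    a, b = 0.0, 1.0
    for step in (0.2, 0.1, 0.05):
        x, y = a + step, b + step
        exact = example(x, y)
        approximations = taylor_approximations(example, a, b, x, y)
        print(step, [abs(exact - value) for value in approximations])
```

Each printed row lists the errors of the three approximations for one displacement; the errors should fall as the order rises and as $|x - a|$ and $|y - b|$ shrink.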

As in the case of functions of one variable, the formulas (25) and
(27) are most convenient if $|x - a|$ and $|y - b|$ are very small.
But if these differences are great, the formulas fail to hold and can
lead to erroneous conclusions. Formulas for functions of more than
two independent variables are similar in form but we will not dwell
on them.

Fig. 32

The formulas just obtained can be used, say, for investigating
extremum points of the function $f(x, y)$. Suppose that this function
has an extremum (maximum or minimum) at $x = a$, $y = b$. The
rough arrangement of the level lines of the function $f$ in the plane
of the arguments near the extremum point $M$ is shown in Fig. 32.
It is then clear that if we put $y = b$ and change $x$ alone, then the
resulting function $f(x, b)$ of one variable $x$ has an extremum at
$x = a$. Geometrically, this means that if we go along the straight
dashed line in Fig. 32, we get an extremum at $M$. But then, as
we know from the calculus of functions of one variable, the derivative
at the extremum point must be zero:

$$\left[\frac{d}{dx} f(x, b)\right]_{x=a} = 0$$

In the square brackets we have the derivative with respect to $x$
with fixed $y = b$, that is, a partial derivative with respect to $x$.
In similar fashion we consider the case of a fixed $x = a$ and a varying
$y$. We arrive at the necessary conditions for the extremum of a
function of two variables:

$$f'_x(a, b) = 0, \quad f'_y(a, b) = 0 \tag{28}$$

(We  made  use  of  similar  reasoning  in  Sec.  2.3  when  we  considered 
the  method  of  least  squares.) 

As we know, for a function $f(x)$ of one variable the necessary
condition $f'(a) = 0$ of an extremum is "almost sufficient": for
example, if $f''(a) \neq 0$, then there must be an extremum at the point
$x = a$, that is, a maximum for $f''(a) < 0$, and a minimum for $f''(a) > 0$.
It might be expected that also in the case of a function of two
variables there must be an extremum at the point $(a, b)$, provided
condition (28) holds and the second-order partial derivatives
at that point are different from zero. But this is not so; the sufficient
condition turns out to be more complicated.

Let us first consider some examples. Let

$$z = f(x, y) = x^2 + y^2$$

The condition (28) yields $2x = 0$, $2y = 0$, that is, the suspected
extremum is the origin of coordinates. Indeed, here at the origin
we have a minimum since $z = 0$ and at other points $z > 0$. As for
the second-order partial derivatives, they are constant in this example,
and $f''_{xx} = 2$, $f''_{xy} = 0$, $f''_{yy} = 2$. In similar fashion it is easy to verify
that the function

$$z = Ax^2 + Cy^2 \tag{29}$$

has a minimum at the origin for $A > 0$, $C > 0$ and a maximum for
$A < 0$, $C < 0$.

A completely new picture emerges if in (29) $A$ and $C$ are of
different sign, for instance, if we consider the function

$$z = x^2 - y^2$$

The appropriate level lines are shown in Fig. 33. The portion of the
plane where $z > 0$ is shown hatched in the figure; the small arrows
indicate the direction of fall of the level (on a map, this would mean
the direction of water flow). For $y = 0$ we get $z = x^2$, that is, the
function increases in both directions from the origin along the
$x$-axis and has a minimum at the origin itself. But if $x = 0$, then
$z = -y^2$, that is, the function decreases in both directions along
the $y$-axis and has a maximum at the origin. If we consider other
straight lines passing through the origin, the function has a maximum
at the origin along some lines and a minimum at the origin along
others. This case is called a minimax, and there is no extremum at
the origin, although the necessary conditions (28) hold and the
second-order partial derivatives are not all zero. In geography, a
minimax is called a saddle; it is observed, for instance, at the highest
point of a mountain pass. Another example is a simple horse
saddle. For example, take a saddle corresponding to Fig. 33; one
can sit on it with the legs hanging along the $y$-axis, facing along the $x$-axis.
Then in front and behind (along the $x$-axis) the saddle rises.

We have a similar minimax for the function $z = xy$ at the origin;
the corresponding pattern of level lines is shown in Fig. 26.
(Where is $z > 0$ and $z < 0$ in Fig. 26?)

Now let us examine the general case. If the necessary conditions
(28) are fulfilled, then formula (27) becomes

$$f(x, y) = f(a, b) + \frac{1}{2}[f''_{xx}(a, b)(x - a)^2 + 2f''_{xy}(a, b)(x - a)(y - b) + f''_{yy}(a, b)(y - b)^2] \tag{30}$$

In deriving this formula, we of course disregard third-order terms,
but when investigating the extremum we must be near the values
$x = a$, $y = b$, where these terms are less essential than the second-order
terms that have been written down. For brevity put

$$f''_{xx}(a, b) = A, \quad f''_{xy}(a, b) = B, \quad f''_{yy}(a, b) = C, \quad x - a = \xi, \quad y - b = \eta$$

Then formula (30) shows that everything depends on the behaviour
of the quadratic form (this is a homogeneous polynomial of second
degree) $P(\xi, \eta) = A\xi^2 + 2B\xi\eta + C\eta^2$ since, approximately, near
the values $x = a$, $y = b$, $f(x, y) = f(a, b) + \frac{1}{2} P(\xi, \eta)$. If it is
positive for all $\xi$, $\eta$ not both zero (for example, if it is of the form $\xi^2 + \eta^2$),
then $f(x, y) > f(a, b)$ near the point $(a, b)$ and thus the function $f$
has a minimum at this point. If this form is negative, then $f$ has a
maximum at $(a, b)$. But if the form can assume values of both signs
(say, if it is of the form $\xi^2 - \eta^2$), then there will be a minimax
at the point $(a, b)$, in other words there will not be any extremum.
How do we know, from the coefficients $A$, $B$, $C$, which of
these cases occurs? Write

$$P(\xi, \eta) = \eta^2\left[A\left(\frac{\xi}{\eta}\right)^2 + 2B\,\frac{\xi}{\eta} + C\right] = \eta^2(At^2 + 2Bt + C) \tag{31}$$

where $t$ stands for $\xi/\eta$. From elementary mathematics we know
that if the discriminant satisfies

$$B^2 - AC > 0 \tag{32}$$

the polynomial in brackets has two real zeros where it changes sign.
Hence this is a minimax case. But if

$$B^2 - AC < 0 \tag{33}$$

then the indicated polynomial has imaginary zeros and so there is
no change of sign (a polynomial can change sign only when it passes
through a zero). Hence this is an extremum case. To find out what
sign the right member of (31) has, put $t = 0$. We see that if in addition
to (33) we have $C > 0$, then the right side of (31) is positive
for all $t$ and therefore, by virtue of the preceding paragraph, the
function $f$ has a minimum at the point $(a, b)$. If in addition to (33)
we have $C < 0$, then $f$ has a maximum. (Verify all these conditions
using the examples discussed above.)
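The test based on conditions (32) and (33) is easy to mechanize. The following Python sketch is a minimal illustration, assuming the second-order partial derivatives at the stationary point are already known; the function name `classify_stationary_point` and the label "undecided" for the borderline case $B^2 - AC = 0$ are our own choices.

```python
def classify_stationary_point(A, B, C):
    """Classify a stationary point from A = f_xx, B = f_xy, C = f_yy.

    Returns "minimum", "maximum", "minimax" or "undecided".
    """
    discriminant = B * B - A * C
    if discriminant > 0:          # condition (32): the quadratic form changes sign
        return "minimax"
    if discriminant < 0:          # condition (33): the form keeps one sign
        return "minimum" if C > 0 else "maximum"
    return "undecided"            # B^2 - AC = 0: second-order terms do not settle it

if __name__ == "__main__":
    # (A, B, C) at the origin for the examples of this section
    examples = {
        "z = x^2 + y^2": (2, 0, 2),
        "z = -x^2 - y^2": (-2, 0, -2),
        "z = x^2 - y^2": (2, 0, -2),
        "z = xy": (0, 1, 0),
    }
    for name, coefficients in examples.items():
        print(name, "->", classify_stationary_point(*coefficients))
```

Note that when (33) holds, $AC > B^2 \geq 0$, so $C$ cannot vanish and the sign test on $C$ is always decisive.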

Minimax points are of great importance in the solution of an
important class of problems which we will illustrate with a vivid
example. Suppose we have two villages $A$ and $B$ in a region of
rolling country with the level lines as indicated in Fig. 34. It is required
to construct a road between them. This can be done in a variety of
ways, several of which are shown by dashed lines in Fig. 34. (For
clarity, arrows have been added to indicate lines of water flow.)
Taking any road $(l)$ from $A$ to $B$, we have to go uphill to the point
indicated in Fig. 34 and then downhill. If the uphill portion is difficult
for transport, the natural thing is to require that it be a minimum.
To be more precise in stating this requirement, we denote by $z(M)$ the
altitude at any point $M$ on the map. Then the uphill portion along
the line $(l)$ is equal to $\max_{M \text{ on } (l)} z(M) - z(A)$, where $\max_{M \text{ on } (l)} z(M)$ stands
for the greatest value of altitude $z$ on the line $(l)$. But the value
of $z(A)$ for all lines being compared is the same; thus, from among
all lines $(l)$ joining $A$ and $B$ it is required to find that line along which
$\max_{M \text{ on } (l)} z(M)$ is a minimum, that is to say, the line that realizes
$\min_{(l)} \max_{M \text{ on } (l)} z(M)$. It is clear that the desired line passes through the
point $C$ of the mountain pass which is precisely the minimax point
of the function $z(M)$. True, there is another such point $D$, but it is
higher and therefore worse. (Verify the fact that if the level lines in
Fig. 34 are drawn every 100 metres, the intermediate rise in altitude
along the roads indicated in the drawing is, respectively, 500, 600,
300, 200, and 400 metres.)
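The requirement that the greatest altitude along the road be as small as possible can be tried out on a crude map. The Python sketch below is only an illustration under our own assumptions: the altitudes are given on a small rectangular grid (the array `altitude` is invented and is not read off Fig. 34), a road moves between neighbouring cells, and the search is a variant of Dijkstra's algorithm in which the cost of a road is its highest point rather than its length. All names are hypothetical.

```python
import heapq

def minimax_path(altitude, start, goal):
    """Find a grid path from start to goal whose highest point is as low as possible.

    altitude is a list of rows; start and goal are (row, column) pairs.
    Returns (peak, path), where peak is the greatest altitude met on the path.
    """
    rows, cols = len(altitude), len(altitude[0])
    best = {start: altitude[start[0]][start[1]]}   # lowest known peak for each cell
    previous = {}
    heap = [(best[start], start)]
    while heap:
        peak, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if peak > best[cell]:
            continue                                # outdated heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_peak = max(peak, altitude[nr][nc])
                if new_peak < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_peak
                    previous[(nr, nc)] = cell
                    heapq.heappush(heap, (new_peak, (nr, nc)))
    path = [goal]
    while path[-1] != start:                        # walk back from the goal
        path.append(previous[path[-1]])
    return best[goal], path[::-1]

# Invented altitudes in hundreds of metres: a ridge (9) with two passes, 3 and 4.
altitude = [
    [2, 9, 2],
    [1, 4, 1],
    [1, 9, 1],
    [1, 3, 1],
    [2, 9, 2],
]

if __name__ == "__main__":
    peak, path = minimax_path(altitude, start=(2, 0), goal=(2, 2))
    print(peak, path)
```

On this invented map the road found crosses the ridge at the lower pass (altitude 3) and avoids the higher one (altitude 4), in the same way that the pass $C$ is preferred to $D$ in the text.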

The difference between the cases (32) and (33) serves as a basis
for a classification of points on an arbitrary surface in space. Suppose
we have a surface $(S)$ and any point $N$ on that surface. We choose
a system of coordinates so that the $x$-axis and $y$-axis are parallel
to the plane $(P)$ tangent to $(S)$ at the point $N$. Then, near $N$, the
surface $(S)$ may be represented by the equation $z = f(x, y)$, and,
because of the choice of axes, equations (28) hold at point $N$; here,
$a$ and $b$ are the values of the coordinates $x$ and $y$ at the point $N$ so
that $N = (a, b, f(a, b))$. Then, depending on whether inequality (33)
or (32) holds, $N$ is called an elliptic or hyperbolic point of the surface
$(S)$. In the former case, the surface $(S)$ near $N$ is convex and is located
to one side of $(P)$. In the latter case, the surface $(S)$ near $N$ is saddle-like
and is located on both sides of $(P)$; the tangent plane $(P)$ intersects
$(S)$ in two lines that intersect in the point $N$.

It can be demonstrated that the conditions (32) and (33) are
not violated under a rotation of the coordinate axes in space.
Therefore if the equation of the surface $(S)$ is given as $z = f(x, y)$,
then in order to determine the type of some point of the surface
there is no need to choose new axes in order to satisfy the condition
(28). One merely has to verify which of the conditions (32) or (33) holds in
the original system of coordinates $x$, $y$, $z$.

There are surfaces (spheres, ellipsoids, paraboloids of revolution,
etc.) where all points are elliptic; such surfaces are convex in the
large. There are surfaces, for instance, the surface with equation
$z = x^2 - y^2$ that corresponds to Fig. 33, in which all points are
hyperbolic. On the other hand, there are surfaces with points of both
kinds. The surface of a torus (a doughnut of ideal shape) is one such
example. Here, the portions filled with elliptic points are separated
from the portions filled with hyperbolic points by lines at the points
of which $B^2 - AC = 0$. These are called parabolic points. On the
torus these are points at which the tangent plane is perpendicular
to the axis of the torus; they fill two circles. (Try to figure out where
on your own body there are lines of parabolic points.) Incidentally,
there are surfaces, cylindrical or conic, that are completely
filled with parabolic points.
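The statement about the torus can be checked numerically. The Python sketch below relies on our own assumptions: the torus has its axis along $z$, centre-circle radius $R = 2$ and tube radius $r = 1$ (hypothetical values); only its upper half is used, written as a graph $z = f(x, y)$; the second derivatives are estimated by central differences; and, as stated above, the sign of $B^2 - AC$ is evaluated in the original coordinates without making the tangent plane horizontal.

```python
import math

R, r = 2.0, 1.0  # hypothetical torus: axis along z, centre-circle radius R, tube radius r

def upper_torus(x, y):
    """Upper half of the torus written as a graph z = f(x, y)."""
    rho = math.hypot(x, y)
    return math.sqrt(r**2 - (rho - R)**2)

def discriminant(f, a, b, h=1e-3):
    """Estimate B^2 - AC at (a, b) with central differences."""
    A = (f(a + h, b) - 2 * f(a, b) + f(a - h, b)) / h**2
    C = (f(a, b + h) - 2 * f(a, b) + f(a, b - h)) / h**2
    B = (f(a + h, b + h) - f(a + h, b - h)
         - f(a - h, b + h) + f(a - h, b - h)) / (4 * h**2)
    return B * B - A * C

def point_type(value, tolerance=1e-3):
    """Name the type of point from the estimated value of B^2 - AC."""
    if value < -tolerance:
        return "elliptic"
    if value > tolerance:
        return "hyperbolic"
    return "parabolic (within tolerance)"

if __name__ == "__main__":
    for rho in (1.5, 2.0, 2.5):      # inside, on, and outside the top circle rho = R
        value = discriminant(upper_torus, rho, 0.0)
        print(rho, round(value, 4), point_type(value))
```

The outer part of the torus (distance from the axis greater than $R$) comes out elliptic, the inner part hyperbolic, and the top circle, where the tangent plane is perpendicular to the axis, parabolic within the chosen tolerance.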

It is clear that the type of a point of a surface does not change
under any rigid motion (translation or rotation) of the surface in space,
that is, the indicated classification of the points of the surface is
geometrically invariant. In contrast, the notion of a highest or lowest
point of a surface, which is essential in the study of extrema, is not
invariantly linked to the surface itself, since the highest point ceases
to be such under a rotation of the surface.

Similarly, the points of the graph $(L)$ of a function $y = \varphi(x)$
corresponding to its extremal values are not connected invariantly
with the line (unlike, say, the inflection points of this line). The points
of the line with equation $f(x, y) = 0$, in which $\frac{dy}{dx} = 0$ or $\frac{dy}{dx} = \infty$,
are not invariant either.

The foregoing reasoning can be applied to a study of the structure
of a line $(L)$ with the equation $f(x, y) = 0$ on an $xy$-plane in the
vicinity of a point $N(a, b)$ of that line. In this connection, it is useful
to consider the auxiliary surface $(S)$ with the equation $z = f(x, y)$
so that $(L)$ results from an intersection of $(S)$ with the plane $(P)$:
$z = 0$. Here, the line $(L)$ turns out to be included in the family of
curves with equation $f(x, y) = C$ for different values of $C$, that is
to say, level lines of the function $f$ resulting from the intersection of
$(S)$ with the planes $z = C$ parallel to $(P)$. If $f'_x(a, b) \neq 0$ or $f'_y(a, b) \neq 0$
(in that case $N$ is called an ordinary point of the line $(L)$), then
$(P)$ is not tangent to $(S)$ at the point $(a, b, 0)$ and so $(L)$ has the form
of a smooth arc near $N$. But if $f'_x(a, b) = f'_y(a, b) = 0$, then $N$ is
called a singular point of the line $(L)$ and $(L)$ consists of points common
to the surface $(S)$ and the plane $(P)$ tangent to $(S)$. From the
foregoing it follows that if inequality (33) holds, then $N$ is an isolated
point of the line $(L)$, that is, there are no other points of $(L)$ in the
vicinity of $N$. If the inequality (32) holds true, then $N$ is a point of
self-intersection of $(L)$, that is, in some vicinity of $N$ the curve $(L)$
consists of two arcs intersecting at $N$. But if $B^2 - AC = 0$, then
the structure of $(L)$ in the vicinity of $N$ can be substantially more
intricate. (To illustrate, pass to polar coordinates and construct the
curves $x^2 - y^2 = (x^2 + y^2)^2$ and $x^2 + y^2 = (x^2 + y^2)^2$. What is
the coordinate origin for each of them?)
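As a hedged sketch of how the suggested passage to polar coordinates might go (the symbols $r$ and $\psi$ for the polar radius and angle are our own notation), put $x = r\cos\psi$, $y = r\sin\psi$:

$$
\begin{aligned}
x^2 - y^2 = (x^2 + y^2)^2 &\;\Longrightarrow\; r^2\cos 2\psi = r^4 \;\Longrightarrow\; r = 0 \ \text{ or } \ r^2 = \cos 2\psi,\\
x^2 + y^2 = (x^2 + y^2)^2 &\;\Longrightarrow\; r^2 = r^4 \;\Longrightarrow\; r = 0 \ \text{ or } \ r = 1.
\end{aligned}
$$

For the first curve, written as $f = x^2 - y^2 - (x^2 + y^2)^2 = 0$, the second derivatives at the origin are $A = 2$, $B = 0$, $C = -2$, so $B^2 - AC = 4 > 0$: two arcs of the curve $r^2 = \cos 2\psi$ cross at the origin. For the second curve, $f = x^2 + y^2 - (x^2 + y^2)^2$ gives $A = C = 2$, $B = 0$, so $B^2 - AC = -4 < 0$: the origin is an isolated point, and the rest of the curve is the circle $r = 1$.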
Exercise

Test the following functions for extrema:

(a) $f(x, y) = x^2 - xy + y^2 - 2x + 4y - 1$;

(b) $f(x, y) = x^2 + 3xy - 2y + 2$;

(c) $f(x, y, z) = x^2 + y^2 - z^2 + 2z$.

4.7 Multiple integrals

We start with an example. Consider a solid $(\Omega)$ whose density
$\rho$ g/cm³ is known and is nonhomogeneous; that is, the density differs
at different points. Suppose we want to compute the mass $m$ of the
solid. A similar problem for a linear distribution of mass is known
to be solved with the aid of ordinary integrals: if the mass is distributed
along a line segment $a$, $b$ with linear density $\sigma$ g/cm, then

$$m = \int_a^b \sigma(x)\,dx$$

The spatial case is studied in exactly the same way as the
linear case. We partition (mentally) $(\Omega)$ into subvolumes $(\Delta\Omega_1)$,
$(\Delta\Omega_2)$, ..., $(\Delta\Omega_n)$ and choose in each of them a point: $M_1$, $M_2$, ..., $M_n$
(Fig. 35 depicts a solid with subvolumes numbered in arbitrary fashion).
If the subvolumes are small enough, we can regard the density in
each as being constant. Then the mass of the first subvolume
can be computed as the product of the density by the numerical
value of the volume, or $\rho(M_1)\,\Delta\Omega_1$; the mass of the second subvolume
is found in the same way, and so on. We then have

$$m(\Omega) \approx \rho(M_1)\,\Delta\Omega_1 + \rho(M_2)\,\Delta\Omega_2 + \dots + \rho(M_n)\,\Delta\Omega_n = \sum_{k=1}^{n} \rho(M_k)\,\Delta\Omega_k$$

where $\Delta\Omega_k$ is to be understood as the numerical value of the volume
$(\Delta\Omega_k)$ (in cm³).

This is an approximate equation since the density inside each
subvolume is not precisely constant. However, the finer the partition,
the more exact it is, and in the limit, when the fineness of the partition
becomes infinite, we obtain the exact equation

$$m(\Omega) = \lim \sum_{k=1}^{n} \rho(M_k)\,\Delta\Omega_k \tag{34}$$

Arguing in similar fashion, we can conclude that if a charge with
density $\sigma$ is distributed in a solid $(\Omega)$, then the charge $q$ can be
computed from the formula

$$q(\Omega) = \lim \sum_{k=1}^{n} \sigma(M_k)\,\Delta\Omega_k \tag{35}$$

with the designations having the same meaning.

The uniformity of formulas (34) and (35) serves as a basis for the
general definition of the concept of a volume integral.

Suppose we have a finite portion of space or, as we more commonly
say, a finite region $(\Omega)$ and in it (that is to say, at every point $M$) a
function $u = f(M)$ that takes on finite values. (In our first example the
region was a solid $(\Omega)$ and the function was its density, which at every
point assumed a value.) To form the integral sum, we partition the
region $(\Omega)$ into subregions $(\Delta\Omega_1)$, $(\Delta\Omega_2)$, ..., $(\Delta\Omega_n)$ and in each subregion
we take an arbitrary point, $M_1$, $M_2$, ..., $M_n$, and then form the sum

$$\sum_{k=1}^{n} u_k\,\Delta\Omega_k = \sum_{k=1}^{n} f(M_k)\,\Delta\Omega_k \tag{36}$$

where $\Delta\Omega_k$ stands for the numerical value of the volume of a subregion $(\Delta\Omega_k)$.
This sum has a high degree of arbitrariness: its value depends both on
the mode of partition of the region $(\Omega)$ into subregions $(\Delta\Omega_k)$ and also
on the choice of points $M_k$ inside each subregion. However, with
increasing fineness of partition of the region $(\Omega)$ the arbitrary nature
hardly affects the sum and, in the limit, ceases to affect it altogether.
The limit of the sum for an infinitely fine partition of the region $(\Omega)$
is called the (volume) integral of the function $f$ over the region $(\Omega)$:

$$\int_{(\Omega)} u\,d\Omega = \int_{(\Omega)} f(M)\,d\Omega = \lim \sum_{k=1}^{n} f(M_k)\,\Delta\Omega_k \tag{37}$$
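The way the arbitrariness of the integral sum fades as the partition is refined can be watched numerically. The Python sketch below is an illustration under our own assumptions: the region is the unit cube cut into $n^3$ equal subcubes, the density $1 + x^2 + yz$ is an invented example (its exact integral over the cube is $19/12$), and the point $M_k$ in each subcube is chosen in three different ways. All names are hypothetical.

```python
def density(x, y, z):
    """Hypothetical nonhomogeneous density in g/cm^3."""
    return 1.0 + x**2 + y * z

def integral_sum(f, n, pick):
    """Integral sum over the unit cube cut into n*n*n equal subcubes.

    pick(lo, hi) chooses the coordinate of the sample point M_k inside [lo, hi].
    """
    h = 1.0 / n
    cell_volume = h**3
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                x = pick(i * h, (i + 1) * h)
                y = pick(j * h, (j + 1) * h)
                z = pick(k * h, (k + 1) * h)
                total += f(x, y, z) * cell_volume
    return total

def lower_corner(lo, hi):
    return lo

def midpoint(lo, hi):
    return 0.5 * (lo + hi)

def upper_corner(lo, hi):
    return hi

if __name__ == "__main__":
    exact = 19.0 / 12.0  # integral of 1 + x^2 + y*z over the unit cube, in grams
    for n in (2, 4, 8, 16):
        sums = [integral_sum(density, n, pick)
                for pick in (lower_corner, midpoint, upper_corner)]
        print(n, [round(value, 5) for value in sums], round(exact, 5))
```

Because this density increases with each coordinate, the lower-corner and upper-corner sums bracket the exact mass, and the gap between them narrows as $n$ grows.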

Now the formulas (34) and (35) may be written thus:

$$m(\Omega) = \int_{(\Omega)} \rho\,d\Omega, \qquad q(\Omega) = \int_{(\Omega)} \sigma\,d\Omega$$

The product $\rho\,d\Omega$, which corresponds to an infinitely small volume
("element of volume") $(d\Omega)$, is called the element (or differential)
of mass and is denoted

$$dm = \rho\,d\Omega \tag{38}$$

where $\rho$ is the density at any one of the points of $(d\Omega)$. In setting up
this expression, we can drop higher-order infinitesimals (which result
from the fact that the density even in a small volume is variable),
so that it is directly proportional to the volume $d\Omega$. Summing all
elements (38) over the volume $(\Omega)$, we get the total mass

$$m(\Omega) = \int_{(\Omega)} dm = \int_{(\Omega)} \rho\,d\Omega$$
The basic properties of volume integrals are similar to the corresponding
properties of "simple" integrals.

The integral of a sum is equal to the sum of the integrals, that is,

$$\int_{(\Omega)} (u_1 \pm u_2)\,d\Omega = \int_{(\Omega)} u_1\,d\Omega \pm \int_{(\Omega)} u_2\,d\Omega$$

A constant factor may be taken outside the integral sign:

$$\int_{(\Omega)} Cu\,d\Omega = C \int_{(\Omega)} u\,d\Omega \quad (C \text{ constant})$$

For any partition of the region $(\Omega)$ into parts, say $(\Omega_1)$ and
$(\Omega_2)$, it will be true that

$$\int_{(\Omega)} u\,d\Omega = \int_{(\Omega_1)} u\,d\Omega + \int_{(\Omega_2)} u\,d\Omega$$

The integral of unity is equal to the volume of the region of integration:

$$\int_{(\Omega)} d\Omega = \Omega$$

If the variables at hand are dimensional, the dimensionality of
the integral is equal to the product of the dimensions of the integrand
by the dimensions of the volume:

$$\left[\int_{(\Omega)} u\,d\Omega\right] = [u] \cdot [\Omega]$$
where the square brackets denote the dimensions of a quantity.
Example: $[\rho] = \text{g cm}^{-3}$; $\left[\int \rho\,d\Omega\right] = \text{g cm}^{-3} \cdot \text{cm}^3 = \text{g} = [m]$.

The mean value ("integral mean", "arithmetical mean") of the
function $u(M)$ over the region $(\Omega)$ is introduced as a constant $\bar{u}$, the
integral of which over the region $(\Omega)$ is equal to the integral of the
function $u$ over this region. Thus

$$\int_{(\Omega)} \bar{u}\,d\Omega = \int_{(\Omega)} u\,d\Omega$$

whence

$$\int_{(\Omega)} u\,d\Omega = \bar{u}\,\Omega \quad \text{and} \quad \bar{u} = \frac{1}{\Omega} \int_{(\Omega)} u\,d\Omega$$

As in the case of "simple" integrals, we derive $\min_{(\Omega)} u \leq \bar{u} \leq \max_{(\Omega)} u$.

At the beginning of this section we regarded the concept of density
as straightforward and clear and expressed the mass in terms of the
density with the aid of integration. Conversely, we can proceed from
the mass and express the density in terms of the mass. To do this we
have to form the ratio

$$\rho_{\text{mean}} = \frac{m(\Delta\Omega)}{\Delta\Omega}$$

of the mass corresponding to a small region $(\Delta\Omega)$ to the volume of this
region. This ratio is the mean density in the region $(\Delta\Omega)$. To obtain
the density at a point $M$, we have to infinitely shrink $(\Delta\Omega)$ to this
point and take the limit of the indicated ratio:

$$\rho(M) = \lim_{(\Delta\Omega) \to M} \frac{m(\Delta\Omega)}{\Delta\Omega} = \frac{dm}{d\Omega} \tag{39}$$

This process is similar to differentiation.

If one takes into account the discrete molecular structure of matter,
then in (39) the volume $(\Delta\Omega)$ cannot be infinitely shrunk to a
point even in one's imagination. Instead of this formula we have to
write

$$\rho(M) = \frac{m(\Delta\Omega)}{\Delta\Omega}$$

where $(\Delta\Omega)$ is a "practically infinitesimal region" containing the point
$M$, that is, a region sufficiently small relative to macroscopic bodies and
at the same time sufficiently large if compared with molecular dimensions.
Here we pass, as it were, from the discrete picture of a material
body to its continuous model, the density of which is obtained by averaging,
that is, via a computation of the mean density of the original
picture over volumes of the indicated "practically infinitesimal"
dimensions.

From now on, when we consider continuous media, we will abstract
from the discrete structure of matter and assume that such a transition
to the continuous model of the medium and its density has already
been carried out.

It may turn out that one of the dimensions of a portion of space
occupied by a mass or charge is substantially less than the other two
dimensions (say the thickness is much less than the length and width).
We then assume that the mass or charge is distributed over a surface.
Similarly, if two dimensions of this portion are appreciably less than
the third, then we assume that the mass or charge is distributed along
a line. In that case, formulas (34) and (35) remain valid if by density
we have in mind the surface density (referred to a unit surface) or
linear density (referred to a unit length) and by $\Delta\Omega_k$ the area or length
of the subregion $(\Delta\Omega_k)$, respectively. In the general case we say that
$\Delta\Omega_k$ is the measure of the region $(\Delta\Omega_k)$, and is to be understood as the
volume, area, or length, depending on whether we are discussing
volume, surface or linear regions.

The definition of an integral over a surface (plane or curved)
and also of an integral along a line is given in the same way as the
volume integral, that is, via formula (37). Naturally, in place of the
volume of a subportion one has to take the area or the length. Volume
and surface integrals are called multiple integrals for reasons that will
emerge later, surface integrals being termed double integrals, and volume
integrals, triple integrals. The integral along a line is called a line
integral (actually it is a curvilinear integral). The properties of all these
integrals are completely analogous and will be widely used in this text
in Chs. 10 and 11, where we discuss the theory of vector fields.

Multiple integrals are calculated with the aid of ordinary integrals.
Let us consider a double integral, for example

$$I = \int_{(\Omega)} u\,d\Omega$$

taken over the region $(\Omega)$ in a plane. To go over to ordinary integrals,
we have to introduce coordinates, say the Cartesian coordinates $x$, $y$
(Fig. 36). Then $u$ may be regarded as a function of $x$, $y$ and we have

$$I = \int_{(\Omega)} u(x, y)\,d\Omega$$

Since in defining an integral we can partition the region $(\Omega)$ in arbitrary
fashion, we can take a partition by the straight lines $x = \text{constant}$
and $y = \text{constant}$, which fits integration in terms of Cartesian coordinates.
Then all subregions $(\Delta\Omega)$ will be in the form of rectangles with
sides $dx$ and $dy$, so that $d\Omega = dx\,dy$, or

$$I = \iint_{(\Omega)} u(x, y)\,dx\,dy$$

Here  we  have  two  integral  signs  since  the  summation  is  naturally  carried 
out  in  two  stages:  for  instance,  we  first  sum  over  all  rectangles  of  a 
column  corresponding  to  a certain  fixed  x and  then  we  sum  the  results 
over  all  x.  The  first  summation  yields 

Vii*)  rVii*)  \ 

^ u(x,  y)  dx  dy  = J ^ u(x,  y)  dyjdx 

vi(*)  'y  i(*)  ' 

where  y = yi(x)  and  y = y2(%)  (see  Fig.  36)  are  the  coordinates  of 
the  lower  and  upper  points  of  the  region  (£1)  for  a given  x. 

After  the  second  summation  we  finally  get 

b /y2(*)  \ 

I = U ^ u{x,  y)  dy\dx 

a ' y,(*)  j 

or,  as  it  is  more  common  to  write, 

$$I = \int_a^b dx \int_{y_1(x)}^{y_2(x)} u(x, y) \, dy \tag{40}$$

Thus, computing a double integral reduces to twofold ordinary
integration. First the inner integral is evaluated:

$$\int_{y_1(x)}^{y_2(x)} u(x, y) \, dy$$

After integration and substitution of the limits we get a result that,
generally, depends on $x$ (since $x$ was held constant in the process of
integration), that is, we have a certain function $f(x)$ that must be
integrated with respect to $x$ to yield the final result:

$$I = \int_a^b f(x) \, dx$$

Fig. 37

The integration could have been performed in the reverse order:
first (inner) with respect to $x$, and then (outer) with respect to $y$.
The result is the same, although in a practical evaluation one method
may prove to be more involved than the other.
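The reduction of a double integral to two ordinary integrations, in either order, can be tried out numerically. What follows is only a minimal sketch and not part of the original text: the helper name `iterated_integral`, the midpoint rule, the test region between $y = x^2$ and $y = x$, and the integrand $x + y$ are all illustrative assumptions.

```python
import math


def iterated_integral(u, a, b, lower, upper, n=200):
    """Midpoint-rule value of the iterated integral
    int_a^b ds int_{lower(s)}^{upper(s)} u(s, t) dt.

    `s` is the outer variable and `t` the inner one (hypothetical names).
    """
    hs = (b - a) / n
    total = 0.0
    for i in range(n):
        s = a + (i + 0.5) * hs
        lo, hi = lower(s), upper(s)
        ht = (hi - lo) / n
        # Inner sum: s is held fixed while t runs over its strip.
        inner = sum(u(s, lo + (j + 0.5) * ht) for j in range(n)) * ht
        total += inner * hs
    return total


# Inner integration over y (from x**2 to x), outer integration over x.
y_inner = iterated_integral(lambda x, y: x + y, 0.0, 1.0,
                            lambda x: x * x, lambda x: x)
# Reverse order: inner over x (from y to sqrt(y)), outer over y.
x_inner = iterated_integral(lambda y, x: x + y, 0.0, 1.0,
                            lambda y: y, lambda y: math.sqrt(y))
print(y_inner, x_inner)  # both close to the exact value 3/20
```

Only the description of the limits changes between the two calls; the region and the integrand are the same, which is why the two numbers agree.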

It is simplest to set the limits of integration if the region $(\Omega)$
is a rectangle with sides parallel to the coordinate axes: then not
only the limits of the outer integral but also those of the inner integral
will be constant. And if, besides, the integrand is a product of a function
dependent on $x$ alone by a function dependent on $y$ alone, then the
entire double integral can be decomposed into a product of two
ordinary (single) integrals:

$$\int_a^b dx \int_c^d [f(x)\,\varphi(y)] \, dy = \int_a^b f(x) \, dx \left( \int_c^d \varphi(y) \, dy \right) = \int_c^d \varphi(y) \, dy \cdot \int_a^b f(x) \, dx \tag{41}$$

By way of an illustration let us evaluate the mass of a flat
lamina shown in Fig. 37, the surface density of which varies by
the law

$$\sigma = \rho_0 (h + \alpha x + \beta y) \tag{42}$$

where $\alpha$, $\beta$ are certain constant coefficients. This law is valid if the
lamina is homogeneous but bounded from below by the $xy$-plane and
above by a plane with equation $z = h + \alpha x + \beta y$ oblique to the first
plane. Then the surface density at any point $(x, y)$ is obtained by
multiplying the density $\rho_0$ of the material of the lamina by the altitude $z$ of
the lamina at the given point, which means we arrive at formula (42).
(At the same time we see that the mass distributed over the plane can
be obtained merely by projecting onto this plane the mass distributed
over the volume.)

By formula (40) we get

$$m = \int_{(\Omega)} \sigma \, d\Omega = \iint_{(\Omega)} \sigma(x, y) \, dx \, dy = \int_0^a dx \int_0^b \rho_0 (h + \alpha x + \beta y) \, dy$$


The  inner  integration  is  performed  with  respect  to  y for  x fixed,  that 
is  to  say,  along  the  parallel  strips  shown  in  Fig.  37.  Evaluate  the  inner 
integral: 

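$$\tau(x) = \int_0^b \rho_0 (h + \alpha x + \beta y) \, dy = \rho_0 \left[ (h + \alpha x)\, y + \frac{\beta y^2}{2} \right]_{y=0}^{b} = \rho_0 \left[ (h + \alpha x)\, b + \frac{\beta b^2}{2} \right]$$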

It  now  remains  to  take  this  result  which  depends  on  x and  integrate 
it  with  respect  to  x: 

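$$m = \int_0^a \tau(x) \, dx = \int_0^a \rho_0 \left[ (h + \alpha x)\, b + \frac{\beta b^2}{2} \right] dx = \rho_0 \left( hab + \frac{\alpha a^2 b}{2} + \frac{\beta a b^2}{2} \right)$$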
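As a quick sanity check of this computation, the sketch below sums the density (42) over a fine grid of the rectangle $0 \le x \le a$, $0 \le y \le b$ and compares the sum with the closed form $\rho_0 (hab + \alpha a^2 b/2 + \beta a b^2/2)$ that direct integration gives. The function name and the numerical values of the constants are arbitrary assumptions chosen for illustration.

```python
def lamina_mass_rectangle(rho0, h, alpha, beta, a, b, n=200):
    """Midpoint-rule mass of the rectangle 0 <= x <= a, 0 <= y <= b
    with surface density rho0 * (h + alpha*x + beta*y)."""
    hx, hy = a / n, b / n
    mass = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        # Inner sum over y for fixed x: the linear density tau(x).
        tau = sum(rho0 * (h + alpha * x + beta * (j + 0.5) * hy)
                  for j in range(n)) * hy
        mass += tau * hx
    return mass


rho0, h, alpha, beta, a, b = 2.0, 1.0, 0.3, 0.5, 2.0, 3.0  # arbitrary values
numeric = lamina_mass_rectangle(rho0, h, alpha, beta, a, b)
closed_form = rho0 * (h * a * b + alpha * a**2 * b / 2 + beta * a * b**2 / 2)
print(numeric, closed_form)
```

Because the density is linear in $x$ and $y$, the midpoint rule reproduces the integral up to rounding error, so the two printed numbers should agree to many digits.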

The  physical  meaning  of  these  computations  consists  in  our  projecting, 
as  it  were,  the  lamina  on  the  x-axis  to  obtain  the  mass  distributed  along 
a line,  namely  a material  segment  with  linear  density 

$$\tau(x) = \int_0^b \sigma(x, y) \, dy$$

(this is the inner integral). The meaning of linear density is immediately
apparent here: it is the mass of the lamina per unit length, so that the
mass of the lamina on the interval from $x$ to $x + dx$ is $dm = \tau(x)\,dx$.
Integrating this result over the segment of the $x$-axis, we get the mass
of the lamina. (Verify the expression just found for the mass by
carrying out the integration in reverse order, first with respect to $x$
and then with respect to $y$, and also by decomposing (42) into three
terms and applying (41) to each one of the appropriate integrals.)
Now suppose the lamina is of triangular shape (Fig. 38) and the
surface density varies by the same law (42), with inner integration
being carried out with respect to $y$, for $x$ fixed, from $y = 0$ to
$y = bx/a$ (Fig. 38). The computations proceed as follows:

$$m = \int_0^a dx \int_0^{bx/a} \rho_0 (h + \alpha x + \beta y) \, dy = \rho_0 \int_0^a \left[ (h + \alpha x) \frac{bx}{a} + \frac{\beta b^2 x^2}{2a^2} \right] dx = \rho_0 \left( \frac{hab}{2} + \frac{\alpha a^2 b}{3} + \frac{\beta a b^2}{6} \right)$$

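As a check, perform the integration in the reverse order. If the
inner integration is performed with respect to $x$ with $y$ constant, then
the limits of integration are (Fig. 39) $x = ay/b$ and $x = a$, whence

$$m = \int_0^b dy \int_{ay/b}^{a} \rho_0 (h + \alpha x + \beta y) \, dx = \rho_0 \int_0^b \left[ h \left( a - \frac{a}{b} y \right) + \frac{\alpha}{2} \left( a^2 - \frac{a^2}{b^2} y^2 \right) + \beta y \left( a - \frac{a}{b} y \right) \right] dy$$

$$= \rho_0 \left[ h \left( ab - \frac{a}{b} \frac{b^2}{2} \right) + \frac{\alpha}{2} \left( a^2 b - \frac{a^2}{b^2} \frac{b^3}{3} \right) + \beta \left( \frac{a b^2}{2} - \frac{a}{b} \frac{b^3}{3} \right) \right] = \rho_0 \left( \frac{hab}{2} + \frac{\alpha a^2 b}{3} + \frac{\beta a b^2}{6} \right)$$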


The  result  is  the  same. 
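The agreement of the two orders of integration for the triangular lamina can also be seen numerically. The sketch below is an illustration under stated assumptions: the function names, the midpoint rule and the sample constants are not from the text, and the closed form used for comparison is $\rho_0 (hab/2 + \alpha a^2 b/3 + \beta a b^2/6)$, which follows from integrating (42) over the triangle with vertices $(0, 0)$, $(a, 0)$, $(a, b)$.

```python
def triangle_mass_y_inner(rho0, h, alpha, beta, a, b, n=300):
    """Inner sum over y from 0 to b*x/a, outer sum over x from 0 to a."""
    hx = a / n
    mass = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (b * x / a) / n
        strip = sum(rho0 * (h + alpha * x + beta * (j + 0.5) * hy)
                    for j in range(n)) * hy
        mass += strip * hx
    return mass


def triangle_mass_x_inner(rho0, h, alpha, beta, a, b, n=300):
    """Inner sum over x from a*y/b to a, outer sum over y from 0 to b."""
    hy = b / n
    mass = 0.0
    for j in range(n):
        y = (j + 0.5) * hy
        left = a * y / b
        hx = (a - left) / n
        strip = sum(rho0 * (h + alpha * (left + (i + 0.5) * hx) + beta * y)
                    for i in range(n)) * hx
        mass += strip * hy
    return mass


rho0, h, alpha, beta, a, b = 2.0, 1.0, 0.3, 0.5, 2.0, 3.0  # arbitrary values
m_y_inner = triangle_mass_y_inner(rho0, h, alpha, beta, a, b)
m_x_inner = triangle_mass_x_inner(rho0, h, alpha, beta, a, b)
m_closed = rho0 * (h * a * b / 2 + alpha * a**2 * b / 3 + beta * a * b**2 / 6)
print(m_y_inner, m_x_inner, m_closed)
```

The only difference between the two functions is how the variable limits of the inner strip are described, which mirrors the change of limits in the text.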

Other types of integrals are evaluated in the same manner. One
always has to express $d\Omega$ in terms of the differentials of the
coordinates, set the limits in accord with these coordinates, and then
evaluate the resulting integrals by the usual rules of integral calculus.
Integrals over areas are double, integrals over volumes are triple
(with three integral signs).

If after passing to the coordinates the integrand turns out to
be independent of one of the coordinates, then the integration with
respect to that coordinate is performed in trivial fashion, and we
straightway pass from a double integral to a single integral, and from
a triple integral to a double or even to a single integral. For instance,
the area of the figure $(\Omega)$ depicted in Fig. 36 may be found as a double
integral of unity (see the properties of an integral):

$$\Omega = \int_{(\Omega)} d\Omega$$

Fig. 38

Fig. 39


To  pass  to  a single  integral  we  set  the  limits 

$$\Omega = \int_a^b dx \int_{y_1(x)}^{y_2(x)} dy = \int_a^b [y_2(x) - y_1(x)] \, dx$$

We have an obvious formula if we recall the geometric meaning
of a definite integral. Similarly, when computing a volume we can
pass from a triple integral straightway to a double integral. Actually,
that is what we did when we calculated the mass of the lamina.

To illustrate the application of double integrals let us evaluate
the integral

$$K = \int_{-\infty}^{\infty} e^{-x^2} \, dx$$

that was mentioned in Sec. 3.2. To do this we have to consider the
auxiliary integral

$$I = \int_{(\Omega)} e^{-x^2 - y^2} \, d\Omega$$

where $(\Omega)$ is the complete $xy$-plane. Setting the limits in terms of
the coordinates $x$ and $y$, when $d\Omega = dx\,dy$, we get (taking into account
formula (41))

$$I = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} e^{-x^2 - y^2} \, dy = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} e^{-x^2} e^{-y^2} \, dy = \int_{-\infty}^{\infty} e^{-x^2} \, dx \int_{-\infty}^{\infty} e^{-y^2} \, dy = K^2$$

Fig. 40

On the other hand, that same integral $I$ may be computed with
the aid of the polar coordinates $\rho$, $\varphi$. Then it is more natural to
partition the plane $(\Omega)$ into subregions by means of circles $\rho$ = constant
and rays $\varphi$ = constant. The area of the resulting subregions (one is
shown in Fig. 40) is equal to $d\Omega = \rho \, d\varphi \, d\rho$. Since $x^2 + y^2 = \rho^2$,
we obtain (think about how the limits are to be set)

$$I = \iint_{(\Omega)} e^{-\rho^2} \rho \, d\varphi \, d\rho = \int_0^{2\pi} d\varphi \int_0^{\infty} e^{-\rho^2} \rho \, d\rho = \varphi \Big|_0^{2\pi} \cdot \left( -\frac{e^{-\rho^2}}{2} \right) \Big|_0^{\infty} = 2\pi \cdot \frac{1}{2} = \pi$$
Equating the results obtained we get

$$K = \sqrt{\pi}, \quad \text{i.e.} \quad \int_{-\infty}^{\infty} e^{-x^2} \, dx = \sqrt{\pi}$$
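Both routes to this result can be checked numerically. The sketch below is an illustration only: the helper `midpoint`, the cut-off `L = 8` that stands in for infinity, and the number of subintervals are assumptions, justified by the very rapid decay of $e^{-x^2}$.

```python
import math


def midpoint(f, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * h) for k in range(n)) * h


L = 8.0  # assumed cut-off replacing infinity; exp(-64) is negligible

# Cartesian route: K is the single integral, and I = K**2.
K = midpoint(lambda x: math.exp(-x * x), -L, L)

# Polar route: I = (integral of d(phi) over 0..2*pi) * (integral of exp(-r^2) r dr).
I_polar = 2 * math.pi * midpoint(lambda r: math.exp(-r * r) * r, 0.0, L)

print(K, math.sqrt(math.pi))
print(K * K, I_polar, math.pi)
```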


Exercises 

1. Compute the integral of the function $x \sin xy$ over the rectangle
   $0 \le x \le 1$, $0 \le y \le \pi$.

2. Compute the mean value of the function $u = e^{xy}$ in the square
   $0 \le x \le 1$, $0 \le y \le 1$.
   Hint. For the second integration, expand the integrand in a
   Taylor series.

3. Evaluate the integral $I = \int_0^1 dx \int_x^1 e^{y^2} \, dy$.
   Hint. Reverse the order of integration.

4.8  Multidimensional  space  and  number  of  degrees  of  freedom 

In the case of a function of two variables we considered, in Sec.
4.2, a "plane of arguments". We can do the same for a function of
three variables and consider a "space of arguments". These notions
are very pictorial and it is therefore desirable to retain the conception
of a space of arguments for the case of functions of any number of
independent variables exceeding three. This is done in the following
manner. Suppose we have a function of four variables $u = f(x, y, z, t)$.
Then we agree to say that every set of values $x, y, z, t$ defines a "point"
in the "four-dimensional space of the arguments $x, y, z, t$". There is
no way to picture such a point or such a space geometrically. Strictly
speaking, the point is nothing but the set of values $x, y, z, t$ and the
space is merely the collection of all such sets. For instance, the set
(or quadruple) $(-3, 0, 2, 1.3)$ is one such point and the quadruple
$(0, 0, 0, 0)$ is another point (the coordinate origin, in our case),
and so on. We can now say that the function $f$ is defined throughout
the four-dimensional space $x, y, z, t$ or in a portion (region) of it.

For the function $z = f(x, y)$ of two variables, the "argument
space" is the $xy$-plane, while the space of the arguments and function
in which the graph of the function is located is our ordinary
three-dimensional space $x, y, z$. In this case the graph is a two-dimensional
surface in three-dimensional space. Reasoning in this manner, we see
that the "graph" of the function $u = f(x, y, z, t)$ requires a
five-dimensional space of arguments and function $x, y, z, t, u$. In order to
find the points of this graph we have to assign to $x, y, z, t$ arbitrary
values and then find the corresponding values of $u$. For example,
verify that the graph of the function $u = xz - 2y^2 t$ passes through
the points $(1, 1, 2, 0, 2)$, $(-1, 2, 0, -2, 16)$, and so on. Thus, the
relationship $u = f(x, y, z, t)$ defines a four-dimensional surface in
five-dimensional space.
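The verification asked for above amounts to substituting the first four coordinates of each five-dimensional point into $u = xz - 2y^2 t$ and comparing with the fifth. A minimal sketch follows; the names `f`, `points` and `on_graph` are hypothetical.

```python
def f(x, y, z, t):
    """The function u = x*z - 2*y**2*t from the example in the text."""
    return x * z - 2 * y**2 * t


# Points of the five-dimensional space (x, y, z, t, u) quoted in the text.
points = [(1, 1, 2, 0, 2), (-1, 2, 0, -2, 16)]

# A point lies on the graph when its last coordinate equals f of the first four.
on_graph = [f(x, y, z, t) == u for (x, y, z, t, u) in points]
print(on_graph)  # [True, True]
```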

Directly  connected  with  this  terminology  is  the  concept  of  the 
number  of  degrees  of  freedom . We  know  that  the  position  of  a point 
in  (ordinary)  space  may  be  described  by  the  Cartesian  coordinates 
x,  y,  z.  There  are  other  coordinate  systems,  but  what  they  have 
in  common  is  that  the  position  of  a point  in  space  is  determined  by 
three  coordinates  (whereas  a point  in  a plane  is  determined  by  two 
coordinates,  and  a point  on  a line  by  a single  coordinate). 

This  fact  can  be  expressed  as  follows:  there  are  three  degrees  of 
freedom  when  choosing  a point  in  physical  space,  or,  what  is  the  same 
thing,  when  a point  is  in  motion  in  space.  There  are  two  degrees  of 
freedom  for  choosing  a point  in  a plane  or  on  any  surface,  and  one 
degree  of  freedom  for  a point  on  a line.  In  other  words,  space  is  three- 
dimensional,  whereas  surfaces  are  two-dimensional  and  lines  are  one- 
dimensional. 

In the general case, the concept of the number of degrees
of freedom is introduced as follows. Let there be a collection of
entities (in the foregoing example this was a collection of points
in space), each of which may be described by indicating the
numerical values of certain continuous parameters (they were coordinates
in the above case). Let these parameters be:

(1) independent, that is, capable of assuming arbitrary values:
for instance, if we fix all parameters except one, then this one can
possibly be made to vary in an arbitrary manner within certain
limits;

(2) essential, which means that the object under study actually
changes for any change in the parameters.

Then if there are $k$ such parameters, we say that there are $k$ degrees
of freedom for choosing an entity from the given collection. The
collection itself is termed a (generalized) $k$-dimensional space or a
$k$-dimensional manifold. The parameters are then termed (generalized)
coordinates in the given space. As in the case of ordinary coordinates
in ordinary space, they may be chosen in a variety of ways, one way
being more convenient than another for any given investigation. In
this way, multidimensional space is interpreted in a very concrete
manner.

For example, in physics one constantly has to do with collections
of "events", each one of which is described by answering the questions
"where?" and "when?" To the first query we respond with the
Cartesian coordinates $x, y, z$, to the second, with the instant of time $t$.
These parameters are independent (they may be altered in arbitrary
fashion) and they are essential (any variation in them gives rise to
a new event). Thus, the space of events is a four-dimensional space
where the generalized coordinates may be $x, y, z, t$.

Another illustration. Let us consider a sequence (system) of
gears, each of which is meshed with the preceding one. There is only
one degree of freedom here, and for the generalized coordinate we
can take the angle of rotation of the first gear wheel, since any
specification of this angle completely defines the position of all the other
gear wheels. (How many degrees of freedom would there be if the gear
wheels were not engaged?)

Let us determine the number of degrees of freedom that a line
segment of a given length $l$ has when in motion over a plane. Each
such line segment is fully determined by the coordinates $(x_1, y_1)$
and $(x_2, y_2)$ of its endpoints. These coordinates may be taken as the
parameters defining the position of the segment. These parameters
are clearly essential but they are not independent, for they are
connected by the relation

$$\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} = l$$

(why?). Thus, only three parameters can be considered independent,
while the fourth parameter is expressed in terms of these via the given
relation, which means that a line segment of given length has three
degrees of freedom when in motion on a plane.
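One way to see the three degrees of freedom concretely is to pick three generalized coordinates and build the segment from them. The sketch below assumes a choice different from the one in the text: instead of three of the four endpoint coordinates it uses one endpoint $(x_1, y_1)$ and the angle `theta` between the segment and the $x$-axis. The function name and sample values are hypothetical.

```python
import math


def segment_endpoints(x1, y1, theta, length):
    """Endpoints of a segment of fixed `length` whose first endpoint is
    (x1, y1) and which makes the angle `theta` with the x-axis."""
    x2 = x1 + length * math.cos(theta)
    y2 = y1 + length * math.sin(theta)
    return (x1, y1), (x2, y2)


seg_len = 3.0  # the given length l
p, q = segment_endpoints(1.0, -2.0, 0.7, seg_len)

# The four endpoint coordinates automatically satisfy the length relation.
constraint = math.hypot(q[0] - p[0], q[1] - p[1])
print(p, q, constraint)
```

Any values of the three parameters give a valid segment, and changing any one of them moves the segment, so the parameters are both independent and essential in the sense defined above.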

In the general case, if there are $n$ parameters and they are
essential but are connected by $m$ independent equations (equations such
that none follows from the others), then $n - m$ parameters can be
assumed independent and the remaining $m$ will be expressed in terms
of them; in other words, we have $n - m$ degrees of freedom.

Now finally let us count the number of degrees of freedom in
choosing an infinite straight line in a plane. We can reason as follows:
choose two arbitrary points $A$ and $B$ in the plane (each has two
coordinates) and draw through them a straight line $P$, which will
thus be defined by four parameters. Since these parameters are
independent, there would seem to be four degrees of freedom. This
reasoning, however, is faulty because under any change of the
parameters (coordinates) the points $A$ and $B$ will change but the straight
line $P$ may still remain unchanged. Hence the requirement that the
parameters be essential is not fulfilled. Since the line $P$ does not
change if $A$ slides along it (one degree of freedom) or $B$ slides along
it (another degree of freedom), it follows that we have two extra
degrees of freedom, and the actual number of degrees of freedom is
$4 - 2 = 2$. For the independent and essential parameters we can take,
say, the coefficients $k$ and $b$ in the equation $y = kx + b$; true, straight
lines parallel to the $y$-axis are not described by such equations, but
these special cases cannot have any effect on counting the number
of degrees of freedom.

An infinite straight line in a plane has thus turned out to have
the same number of degrees of freedom as a point in the plane,
and so we can associate straight lines in a
plane with points in that plane or some other plane. This is most
conveniently done as follows: the last equation is divided by $b$ and
written as

$$\alpha x + \beta y = 1 \tag{43}$$

(so that $\alpha = -k/b$, $\beta = 1/b$). Then a point with coordinates
$(\alpha, \beta)$ is associated with the straight
line having this equation. It is then natural to consider two planes:
with coordinates $x, y$ and with coordinates $\alpha, \beta$ (Fig. 41). Here, the
straight line $(l)$ $y = 2x + 1$, i.e. $-2x + y = 1$, is associated with
the point $L'$ of the $(\alpha, \beta)$-plane having coordinates $(-2, 1)$. This
same formula (43) associates with the point $(x, y)$ of the first plane a
straight line with equation (43) in the second plane. For example,
the point $M(2, -1)$ is associated with the straight line $(m')$ having
the equation $2\alpha - \beta = 1$, the point $N(1, 3)$ is associated with the
straight line $(n')$ with equation $\alpha + 3\beta = 1$ (see Fig. 41). It can
readily be proved in the general form as well that if a straight line $(l)$
in the first plane passes through a point $N$, then in the second plane
the straight line $(n')$ corresponding to $N$ will pass through the point
$L'$ that corresponds to the line $(l)$. We conclude that any assertion
referring to an arbitrary combination of points and lines in a plane
implies the "dual" assertion, in which the lines are replaced by points
and the points by lines. There is a special branch of geometry in which
such assertions and dual relationships are studied (it is called
projective geometry). Note that the equations of the lines passing through
the coordinate origin cannot be written as (43); in other words, such
lines are not associated in the indicated manner with any points, and
there is no straight line that corresponds to the origin. For this reason,
in projective geometry one introduces so-called "points at infinity"
(ideal points) and "a line at infinity" (ideal line). Then this
relationship has no exceptions. We will not dwell further on this subject here.
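The duality statement rests on the symmetry of relation (43): the condition "the point $(x, y)$ lies on the line with coefficients $(\alpha, \beta)$" is the same equation as "the point $(\alpha, \beta)$ lies on the line with coefficients $(x, y)$". The following sketch replays the example of the text with exact rational arithmetic; the function names `dual_point` and `satisfies_43` are hypothetical.

```python
from fractions import Fraction


def dual_point(k, b):
    """Map the line y = k*x + b (b != 0) to the point (alpha, beta)
    for which the same line reads alpha*x + beta*y = 1."""
    k, b = Fraction(k), Fraction(b)
    return (-k / b, 1 / b)


def satisfies_43(p, q):
    """Check p[0]*q[0] + p[1]*q[1] == 1, that is, relation (43) with one
    pair read as (x, y) and the other as (alpha, beta)."""
    return p[0] * q[0] + p[1] * q[1] == 1


L_dual = dual_point(2, 1)   # line (l): y = 2x + 1
N = (1, 3)                  # a point on (l), since 3 = 2*1 + 1
M = (2, -1)                 # a point that is not on (l)

print(L_dual)                    # coordinates of L', equal to (-2, 1)
print(satisfies_43(N, L_dual))   # N lies on (l)
print(satisfies_43(L_dual, N))   # L' lies on (n'): alpha + 3*beta = 1
print(satisfies_43(M, L_dual))   # M does not lie on (l)
```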

Exercise 

How  many  degrees  of  freedom  have  the  following  objects  when 
in  motion  in  space: 

(a) A line segment of given length?

(b) A triangle with given sides (a rigid triangle)?

(c) A rigid body with a fixed point?

(d) A free rigid body?


ANSWERS  AND  SOLUTIONS 


Sec.  4.1 


1. 


— — 2x  and  therefore  — 

dx  dx 


\x  = l 


x=2 
'y=  0.5 


= 1. 


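2. $\frac{\partial z}{\partial x} = -2x e^{-(x^2 + y^2)}$, $\frac{\partial z}{\partial y} = -2y e^{-(x^2 + y^2)}$.

3. $\frac{\partial z}{\partial x} = e^y + y e^x$, $\frac{\partial z}{\partial y} = x e^y + e^x$.

4. $\frac{\partial z}{\partial x} = \sin y$, $\frac{\partial z}{\partial y} = x \cos y$.

5. $\frac{\partial z}{\partial x} = y \cos (xy)$, $\frac{\partial z}{\partial y} = x \cos (xy)$.

6. $\frac{\partial z}{\partial x} = \frac{x}{\sqrt{x^2 + y^2}}$, $\frac{\partial z}{\partial y} = \frac{y}{\sqrt{x^2 + y^2}}$.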

7. 

dt  ~ 

dz  dx  dz 

dx  dt  dy 

dt 

2% 

dt 

dy 

V-- 
dt 

Noting  that  — = 1 and  — = 1 

dt  t2  dt 


— , we  obtain 

2 \t 


8. 


9. 

10. 


^2((  + I)(1-I)-2((+1/o(1  + 


- = e?-y*  (cos  t — It). 

dt 

— = 6(x2  + y2)  t — 

dt 

dz  __  1 -j-  2*ey 
<2*  * + 


2^j 

=-(?+3t/7+1)' 


Sec.  4.2 

1.  A family  of  straight  lines  passing  through  the  origin. 

2.  z = x2  — c 2,  a family  of  parabolas  (Fig.  42). 

3.  A family  of  hyperbolas  located  in  the  upper  half-plane  (Fig.  43). 

4.  A family  of  hyperbolas  (Fig.  43). 
— = — 4 sin  t ; the  value  of  t corresponding  to  the  value

dx 

x = -i  is  obtained  from  the  equation  — — sin  t , whence  t = 


TT  rfy 
6 


= -4 


— = — 2.  The  other  value  t = 2tz  — — 
2 6 


yields  — = 2. 

3*  1 3*  3y2  -b  x c . i . 

— — , — — — . Setting  v = 0,  z = 1 in  the 

dz  3x2  -f  y,  dy  3*2  + y 

formula  y = x3  y 3 -f-  xy,  we  get  x3  = 1,  whence  x = 1. 

For  this  reason, 

dx  I 1 3#  I 1 


dz  |,=o 

z=  1 


dy  |y-o 


dx 

1 

dx  _ 

2xy  + 5y4 

dy 

5x*  + y* 

dy 

5*4  + y2 

dy 

2 — x 

dx 

y- 5 

dy 

*t 

1 

(N 

1 

- 4 x3y 

dx 

> 

1 

rf 

I 

— 2 x2y 

Sec.  4.4 

$S = 4.5$ milliampere/volt, $R = 6.7$ kohms, $\mu = 30$.

Sec.  4.5 


Since $l = v_0^2 / g$, in the case that interests us $l = 1020$.
The equation of the envelope is of the form $y = 510 - \frac{x^2}{2040}$.
For $x = 500$ m we get $y = 390$ m. Hence, a target can be hit
at 300 metres altitude but not at 500 metres altitude.


The  function  has:  (a)  a minimum  at  the  point  x = 0,  y = —2; 

2 4 

(b)  a minimax  at  the  point  x = y = — — ; (c)  a minimax  at 
x = 0,  y = 0,  £ = 1. 

Sec.  4.7 

1. $\int_0^1 dx \int_0^{\pi} x \sin xy \, dy = \int_0^1 dx \, (-\cos xy) \Big|_{y=0}^{\pi} = \int_0^1 (1 - \cos \pi x) \, dx = \left( x - \frac{\sin \pi x}{\pi} \right) \Big|_0^1 = 1.$

   Note that if the integration were reversed (first with respect to
   $x$ and then with respect to $y$), we would have to apply
   integration by parts.

Fig. 44

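2. Since the area of a square is 1, it follows that

   $$\bar{u} = \int_0^1 dx \int_0^1 e^{xy} \, dy = \int_0^1 \frac{e^x - 1}{x} \, dx = \int_0^1 \left( 1 + \frac{x}{2!} + \frac{x^2}{3!} + \cdots \right) dx = 1 + \frac{1}{2 \cdot 2!} + \frac{1}{3 \cdot 3!} + \cdots \approx 1.318$$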


3. In the inner integral the integration with respect to $y$ is carried
   out from $y = x$ to $y = 1$, and so the entire double integral is
   extended over the triangle shown in Fig. 44. Reversing the order
   of integration, we get

   $$I = \int_0^1 dy \int_0^y e^{y^2} \, dx = \int_0^1 y e^{y^2} \, dy = \frac{e^{y^2}}{2} \Big|_0^1 = \frac{e - 1}{2} = 0.859$$

   It is interesting to note that in this example integration cannot
   be carried out in the original order with the aid of elementary
   functions, but reversing the order of integration permits this to
   be done.
Sec. 4.8

(a) 5, (b) 6, (c) 3, (d) 6.


Chapter  5 

FUNCTIONS  OF  A COMPLEX  VARIABLE 


5.1  Basic  properties  of  complex  numbers 

In algebra we have what is called the imaginary
unit, which is denoted by $i$ and is defined by the
condition $i^2 = -1$.

Quantities of the form $x + iy$ (where $x$ and $y$
are real numbers) are called complex numbers,
and are obtained, for example, in the solution
of algebraic equations. The imaginary unit
itself is clearly a root of the quadratic
equation $x^2 = -1$.

If we confine ourselves to real roots, then a
quadratic equation has either two roots or none,
depending on its coefficients. However, if we consider all the roots,
real and imaginary, then a quadratic equation always has two roots.
In the same way, a third-degree equation always has three roots,
and so forth. Thus, the study of complex numbers enables us to
establish a general theorem on the number of roots of an algebraic
equation. In the process we perform algebraic operations with complex
quantities. If we examine more complicated functions of complex
quantities (raising to a complex power, for instance), we get many
important and elegant results. A great variety of relationships
between real quantities are conveniently obtained through the use of
complex numbers.* It is precisely for this reason that complex
quantities are so important to physics and other natural sciences, despite
the fact that all measurements and the results of all experiments
are expressed as real numbers.

Let  us  recall  some  of  the  properties  of  complex  numbers  from  the 
school  course  of  algebra. 

* According to the celebrated French mathematician Jacques Hadamard
(1865-1963), "The shortest path between two truths in the real domain
passes through the complex domain."

A complex number $z = x + iy$ can be represented as a point
in a plane (Fig. 45), where $x$ is laid off on the axis of abscissas (this
is the real part of the number), and $y$ on the axis of ordinates (for the
imaginary part of the number $z$; sometimes the product $iy$ is also
termed the imaginary part of the complex number; this might appear
to be more natural but it is more convenient to adhere to the
definition given above). To every $z$ there corresponds a definite point in
the plane. Instead of a point $z$ we can speak of a vector* $z$, which is
a directed line segment with origin at the origin of the coordinate
system, and with terminus at the point $z$. The position of a point
in a plane may be described by its distance $r$ from the coordinate
origin and the angle $\varphi$ between the vector $z$ and the $x$-axis. We can
thus speak of the length of the vector $z$ (also called the absolute value,
or modulus, of the complex number, which is denoted by $|z|$) and
we can speak of the direction of the vector $z$ (it is indicated by the
angle $\varphi$). The angle $\varphi$ is termed the argument (amplitude or phase)
of the complex number. It is reckoned from the positive $x$-axis
counterclockwise. For a positive real number, $\varphi = 0$; for a pure
imaginary number (the number $2i$, for example), $\varphi = \pi/2$, or (say for
the number $-2i$) $\varphi = -\pi/2$.

Fig. 45

As can be seen in Fig. 45, $x = r \cos \varphi$, $y = r \sin \varphi$, so that we
can express $z$ in terms of $r$ and $\varphi$:

$$z = x + iy = r(\cos \varphi + i \sin \varphi)$$

The notation $z = r(\cos \varphi + i \sin \varphi)$ is called the trigonometric (or
modulus-argument) form of a complex number. Also observe that if
we know $x$ and $y$, it is simple (see Fig. 45) to find $r$ and $\varphi$ from the
formulas

$$r = \sqrt{x^2 + y^2}, \quad \sin \varphi = \frac{y}{\sqrt{x^2 + y^2}}, \quad \cos \varphi = \frac{x}{\sqrt{x^2 + y^2}}$$

* See Ch. 9 for more on the theory of vectors. Here, any basic school course
of physics will suffice as an introduction to vectors.

Algebraic operations with complex numbers are performed by
the ordinary rules of algebra with the convention that $i^2 = -1$. It
is useful to take a good look at the geometric picture of algebraic
operations. For the addition of two numbers $z_1 = x_1 + iy_1$ and
$z_2 = x_2 + iy_2$ we have

$$z = z_1 + z_2 = (x_1 + x_2) + i(y_1 + y_2)$$

It is readily seen that in the plane the number $z$ is represented by a
vector obtained from the vectors $z_1$ and $z_2$ by adding via the
parallelogram rule, that is, in the same way, say, as the resultant of two forces
(Fig. 46).

Fig. 46

Fig. 47
The product $kz$ of a complex number $z$ by a positive real number
$k$ is a vector in the same direction as $z$ but with modulus
(length) equal to $kr$, where $r$ is the modulus of $z$. But if $k$ is a negative
real number, then the vector $kz$ has modulus $|k| r$, but with direction
opposite that of $z$. It is also easy to construct the product $z_1$ of the
complex number $z = x + iy$ by the imaginary unit $i$. We have
$z_1 = i(x + iy) = -y + ix$. From the drawing (see Fig. 47, where
for pictorialness we have used $a$ and $b$ instead of $x$ and $y$) it is easy
to see that the vector $z_1$ is perpendicular to $z$. It has been rotated
counterclockwise from $z$ through a right angle. The modulus of $z_1$ is equal
to that of $z$.

The product of two complex numbers is found from the simple
formula

$$(x_1 + iy_1)(x_2 + iy_2) = x_1 x_2 + x_1 \cdot iy_2 + iy_1 \cdot x_2 + i^2 y_1 y_2 = (x_1 x_2 - y_1 y_2) + i(x_1 y_2 + y_1 x_2)$$

In algebra there is a simple proof (carry it out!), which makes use
of the formulas for the sine and cosine of a sum, of the following
statement: if

$$z_1 = r_1(\cos \varphi_1 + i \sin \varphi_1), \quad z_2 = r_2(\cos \varphi_2 + i \sin \varphi_2)$$

then

$$z = z_1 z_2 = r_1 r_2 [\cos (\varphi_1 + \varphi_2) + i \sin (\varphi_1 + \varphi_2)]$$

When multiplying complex numbers, multiply their moduli and
add the arguments. We will obtain the same result in Sec. 5.3 in a
different way.
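The rule "multiply the moduli, add the arguments" and the special case of multiplication by $i$ (a counterclockwise rotation through a right angle) can be checked with the standard `cmath` functions `rect`, `polar` and `phase`. The moduli and arguments below are arbitrary sample values, chosen so that the sums of arguments stay below $\pi$ and no wrap-around of the reported angle occurs.

```python
import cmath
import math

z1 = cmath.rect(2.0, 0.5)    # modulus 2, argument 0.5 rad
z2 = cmath.rect(1.5, 1.2)    # modulus 1.5, argument 1.2 rad

r, phi = cmath.polar(z1 * z2)
print(r, phi)                # close to 2 * 1.5 = 3.0 and 0.5 + 1.2 = 1.7

rotated = 1j * z1            # multiplication by the imaginary unit
print(abs(rotated), cmath.phase(rotated) - cmath.phase(z1))  # 2.0 and pi/2
```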

Note in conclusion that complex numbers cannot be connected
by an inequality sign: one complex number cannot be greater or
less than another. It might appear to be natural to say that $z_1 > z_2$
if $|z_1| > |z_2|$, that is, to compare complex numbers via their
moduli. But then we would have to write, say, $-3 > -1$ for real
numbers, but this is a contradiction.

Exercise 

Write the following numbers in trigonometric form: $1 - i$,
$3 + 4i$, $-2i$, $-3$, $1$, $0$.

5.2  Conjugate  complex  numbers 

Two complex numbers that differ only in the sign of the imaginary
part are called conjugate complex numbers. The conjugate of $z$ will
be denoted by $z^*$. If $z = x + iy$, then $z^* = x - iy$. In particular,
for real numbers, and only for them, it is true that $A^* = A$, since
in this case the imaginary part is zero. For pure imaginaries (that is,
numbers with real part zero), and only for them, we have $A^* = -A$.


Suppose we have two complex numbers $z_1 = x_1 + iy_1$ and
$z_2 = x_2 + iy_2$. Adding them, we get

$$z_1 + z_2 = (x_1 + x_2) + i(y_1 + y_2)$$

Now form the sum of the conjugates

$$z_1^* + z_2^* = (x_1 + x_2) - i(y_1 + y_2)$$

We thus have a complex number conjugate to the sum $z_1 + z_2$.

Thus, if $z_1 + z_2 = w$, then $z_1^* + z_2^* = w^*$; that is, if the terms in
the sum are replaced by their conjugates, then the sum becomes
conjugate as well.

We now show that a similar property holds true for the product
of complex numbers:

$$\text{if } z_1 z_2 = w, \text{ then } z_1^* z_2^* = w^* \tag{1}$$

Indeed,

$$z_1 z_2 = (x_1 + iy_1)(x_2 + iy_2) = (x_1 x_2 - y_1 y_2) + i(x_1 y_2 + x_2 y_1)$$

$$z_1^* z_2^* = (x_1 - iy_1)(x_2 - iy_2) = (x_1 x_2 - y_1 y_2) - i(x_1 y_2 + x_2 y_1)$$

Comparing these two equations, we see that (1) holds true.

Putting $z_1 = z_2$ in (1), we see that if $z_1^2 = w$, then $(z_1^*)^2 = w^*$;
that is, squaring conjugate numbers results in conjugates. It is clear
that this is valid for arbitrary positive integer powers. Suppose, say,
$z^3 = w$. We write this as $z^2 z = w$. By (1) it follows that $(z^2)^* z^* = w^*$,
but $(z^2)^* = (z^*)^2$, so that $(z^*)^3 = w^*$. Thus, $z^3 = w$ implies $(z^*)^3 = w^*$.

It is easy to establish that if $\frac{z_1}{z_2} = w$, then $\frac{z_1^*}{z_2^*} = w^*$ (see
Exercise 1). Combining these results, we obtain the following general
proposition.

Suppose the complex numbers $z_1, z_2, \ldots, z_n$ are involved in a
certain number of arithmetical operations (additions, multiplications,
divisions, raising to integral powers) and the result has proved equal
to $w$. Then if these same operations (in the same order) are performed
on the numbers $z_1^*, z_2^*, \ldots, z_n^*$, the result will be $w^*$. In other words,
if all complex numbers in an equation are replaced by their
conjugates, then the equation still holds true. It can be shown that this rule
is also valid for nonarithmetic operations (taking roots and logarithms,
etc.). From this it follows that any relation involving complex
numbers holds true if $i$ is everywhere replaced by $-i$, for this means that
we have merely passed from an equation of complex numbers to one
of their conjugates. Thus, the numbers $i$ and $-i$ are essentially
indistinguishable.* (This of course does not mean that $2 + 3i = 2 - 3i$!)


We  can  obtain  this  result  in  straightforward  fashion  if  it  is  observed  that 
( — t) 2 = — I and  also  i2  = — 1,  whence  — i and  i have  the  same  properties. 
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Let  us  consider  a quadratic  equation  with  real  coefficients: 

az2  + bz  + c = 0 (2) 

Let  zQ  = x0  + iy0  be  a root  of  this  equation.  Then 

az\  + bz0  + c = 0 

Replacing  all  numbers  here  by  their  conjugates  and  recalling  that 
a * = a,  b*  ~ b,  c*  = c because  a , b,  c are  real  numbers,  we  obtain 

a{z0)2  -f-  bzQ  -)-  c = 0 =0 

which  is  to  say  that  z*Q  is  also  a root  of  this  quadratic  equation. 
Hence if a quadratic equation with real coefficients has imaginary
(nonreal,* that is) roots, then these roots are conjugate complex
numbers. Incidentally, this is immediately apparent from the quadratic
formula. Equation (2) has the roots

$$z_{1,2} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

The roots are imaginary if $b^2 - 4ac < 0$, and

$$z_1 = -\frac{b}{2a} + i\,\frac{\sqrt{4ac - b^2}}{2a}, \qquad z_2 = -\frac{b}{2a} - i\,\frac{\sqrt{4ac - b^2}}{2a}$$

These are conjugate complex numbers.

The value of this device lies in the fact that it applies not only
to quadratic equations but to equations of any degree. Indeed,
suppose the equation

$$a_0 z^n + a_1 z^{n-1} + \ldots + a_{n-1} z + a_n = 0 \tag{3}$$

with real coefficients $a_0, a_1, \ldots, a_n$ has the imaginary root
$z_0 = x_0 + iy_0$. Then

$$a_0 z_0^n + a_1 z_0^{n-1} + \ldots + a_{n-1} z_0 + a_n = 0$$

Replacing all the numbers by their conjugates in this equation
and noting that $a_0^* = a_0$, $a_1^* = a_1$, ..., $a_n^* = a_n$, we get

$$a_0 (z_0^*)^n + a_1 (z_0^*)^{n-1} + \ldots + a_{n-1} z_0^* + a_n = 0^* = 0$$

Therefore $z_0^* = x_0 - iy_0$ is also a root of (3). Hence in the case of
a real polynomial (that is, a polynomial with real coefficients)
imaginary roots can only appear as conjugate pairs: if there exists a
root $x_0 + iy_0$, then there is also a root $x_0 - iy_0$.
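
The conjugate-pair property is easy to check numerically. The sketch below is only an illustration: the cubic $z^3 - 3z^2 + 7z - 5 = (z - 1)(z^2 - 2z + 5)$ and the helper name `poly_eval` are assumptions chosen for the example and do not come from the text.

```python
def poly_eval(coeffs, z):
    """Evaluate a_0 z^n + ... + a_n by Horner's scheme; coeffs = [a_0, ..., a_n]."""
    result = 0
    for a in coeffs:
        result = result * z + a
    return result


# Illustrative real polynomial: z^3 - 3z^2 + 7z - 5 = (z - 1)(z^2 - 2z + 5)
coeffs = [1, -3, 7, -5]

z0 = 1 + 2j                                      # a nonreal root of the example
value_at_root = poly_eval(coeffs, z0)
value_at_conj = poly_eval(coeffs, z0.conjugate())

# For real coefficients, P(z*) = (P(z))* at any point, not only at roots
z = 0.3 - 1.7j
difference = poly_eval(coeffs, z.conjugate()) - poly_eval(coeffs, z).conjugate()

print(abs(value_at_root), abs(value_at_conj), abs(difference))
```

All three printed magnitudes should be zero up to rounding, which is the statement $P(z^*) = (P(z))^*$ specialized to a root.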

From this it also follows that a real polynomial can have only an
even number of imaginary roots. In higher algebra, proof is given
that a polynomial of degree $n$ has exactly $n$ roots. For this reason,
a real polynomial of odd degree must have at least one real root.
(Prove this in another way by considering the behaviour of the graph
of a polynomial $P(x)$ of odd degree for large $|x|$.) A polynomial of
even degree need not have any real roots.

* Do not confuse the notion imaginary (which means nonreal) with purely
imaginary (which means a real part equal to zero).

From school algebra we know that if the equation (3) has the
roots $z_1, z_2, \ldots, z_n$, then the polynomial on the left of (3) can be
decomposed into factors as follows:

$$a_0 z^n + a_1 z^{n-1} + \ldots + a_{n-1} z + a_n = a_0 (z - z_1)(z - z_2) \ldots (z - z_n) \tag{4}$$

Therefore if $x_0 + iy_0$ is a root of the polynomial, then in the
factorization of this polynomial there is a complex factor $(z - x_0 - iy_0)$.
Since in that case the polynomial also has the root $x_0 - iy_0$, the
factorization likewise involves the factor $(z - x_0 + iy_0)$. In order to
avoid dealing with imaginary numbers in (4), these two factors can
conveniently be combined into one:

$$(z - x_0 - iy_0)(z - x_0 + iy_0) = z^2 - 2x_0 z + x_0^2 + y_0^2$$

In  place  of  two  factors  with  imaginary  coefficients  we  obtain  one 
factor  with  real  coefficients,  but  of  second  degree  in  the  variable  z. 
Thus,  from  the  fundamental  theorem  of  algebra  (which  we  did  not 
prove  but  merely  took  on  trust)  it  follows  that  any  real  polynomial 
can  be  decomposed  into  real  factors  involving  the  variable  z to  the 
first  and  second  powers  but  to  no  higher  power. 
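
As a small illustration of combining a conjugate pair into one real quadratic factor, the following sketch computes the coefficients $-2x_0$ and $x_0^2 + y_0^2$ and compares both sides of the identity at a test point. The function name `real_quadratic_factor` and the sample root $3 + 4i$ are hypothetical choices made for the example.

```python
def real_quadratic_factor(z0):
    """Coefficients (1, p, q) of (z - z0)(z - z0*) = z^2 + p z + q, all real."""
    x0, y0 = z0.real, z0.imag
    return (1.0, -2.0 * x0, x0 * x0 + y0 * y0)


z0 = 3 + 4j                                  # illustrative nonreal root
_, p, q = real_quadratic_factor(z0)          # z^2 - 6z + 25

# Compare the product of the two complex factors with the real quadratic
z = 0.5 - 2j                                 # arbitrary test point
lhs = (z - z0) * (z - z0.conjugate())
rhs = z * z + p * z + q

print(p, q, abs(lhs - rhs))
```

The coefficients come out real whatever nonreal `z0` is supplied, which is why a real polynomial never needs complex coefficients in its factorization once quadratic factors are allowed.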

Exercises 

1. Show that if $\frac{z_1}{z_2} = w$, then $\frac{z_1^*}{z_2^*} = w^*$.

2. Find all the roots of the polynomial $z^4 - 6z^3 + 11z^2 - 2z - 10$
if we know that one of the roots is equal to $2 - i$.

3. Demonstrate that $(z^*)^* = z$.

4. Let $z = x + iy$ and then find $zz^*$.

5.3  Raising  a number  to  an  imaginary  power.  Euler's  formula 

We  now  take  up  the  very  important  question  of  what  raising 
a number  to  an  imaginary  power  means. 

How does one approach such a question? First let us see how the
simpler problem of negative and fractional powers is dealt with in
algebra. The definition of positive integral powers is the only
straightforward and pictorial one:

$$a^1 = a, \quad a^2 = a \cdot a, \quad a^3 = a \cdot a \cdot a, \quad \ldots, \quad a^n = \underbrace{a \cdot a \cdots a}_{n \text{ times}}$$

From this definition stem the following rules:

$$\frac{a^n}{a^m} = a^{n-m} \ (\text{if } n > m), \qquad (a^n)^m = a^{nm}$$

It is then assumed that these rules hold true for all exponents.
From this follows the definition for fractional and negative powers.
For example, from the formula $\left(a^{1/n}\right)^n = a^{n/n} = a^1 = a$
it follows that $a^{1/n}$ is a number which, when raised to the power $n$,
yields $a$, or $a^{1/n}$ is $\sqrt[n]{a}$. In the same way, we have
$a^0 = 1$, $a^{-n} = \frac{1}{a^n}$.

With this approach there is no way of defining $a^i$. Let us try to
do so using what we know about the derivative of an exponential
function, that is to say, using the tools of higher mathematics. The
simplest and most convenient formulas, with no extra coefficients,
are the formulas for differentiating $e^t$, $e^{kt}$:*

$$\frac{de^t}{dt} = e^t, \qquad \frac{de^{kt}}{dt} = ke^{kt}$$

These formulas serve as a starting point for defining the notion of
raising a number to an imaginary power:

$$\frac{de^{it}}{dt} = ie^{it}$$

Set $e^{it} = z(t)$, then $\frac{dz}{dt} = iz$. The relationship between $z$
and $dz$ is shown in Fig. 48. Since $dz = iz\,dt$, it follows that for real
$t$ and $dt$, $dz$ is perpendicular to $z$. Hence, geometrically the change in
$z = e^{it}$, with increasing $t$, by the amount $dt$ amounts to a rotation of
the vector $z$ through the angle $d\varphi$. From the figure it is clear that
since $dz$ is perpendicular to $z$, then $d\varphi$ is equal to the ratio of the
length of the line segment $dz$ (which is equal to $r\,dt$) to the length of
the line segment $z$ (which is equal to $r$). Therefore $d\varphi = dt$. Of
course the angles here ($\varphi$, $d\varphi$ and so on) must be expressed in
natural units (radians and not degrees).

For $t = 0$ we have $e^{it}|_{t=0} = e^0 = 1$, or $z(0)$ is a horizontal vector
whose length is equal to unity. Since the change in $t$ by $dt$ is
associated with a rotation of the vector $z$ through the angle $d\varphi = dt$,
the variation of $t$ from 0 to the given value $t_1$ is associated with the
rotation through the angle $\varphi = t_1$.

Thus, $z = e^{i\varphi}$ is a vector obtained from $z(0) = 1$ by a rotation
through the angle $\varphi$. Set $z = e^{i\varphi} = x + iy$. From Fig. 49 it is
clear that $x = 1 \cdot \cos\varphi = \cos\varphi$, $y = 1 \cdot \sin\varphi = \sin\varphi$ and so

$$e^{i\varphi} = \cos\varphi + i\sin\varphi \tag{5}$$

* It is precisely the condition that these formulas have a simple form that
defines the number $e$ (see, for example, HM, Sec. 3.8).

Fig. 48

Fig. 49

This is Euler's formula. It is easy to verify that we then have
$\frac{d}{d\varphi}(e^{i\varphi}) = ie^{i\varphi}$. Indeed,

$$\frac{d}{d\varphi}(e^{i\varphi}) = \frac{d}{d\varphi}(\cos\varphi) + i\frac{d}{d\varphi}(\sin\varphi) = -\sin\varphi + i\cos\varphi$$

$$= i^2\sin\varphi + i\cos\varphi = i(\cos\varphi + i\sin\varphi) = ie^{i\varphi}$$

Euler's formula was obtained in a different manner in HM,
Sec. 3.18: via an expansion of the left and right members in a
Taylor series. But the definition itself of the functions $e^t$, $e^{it}$ in HM
was based on expressing the derivative of the exponential function,
which means both proofs are fundamentally the same.
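
The geometric argument of this section (each increment $dz = iz\,dt$ is perpendicular to $z$, so $z$ merely rotates) can be imitated numerically. The sketch below is a rough illustration under stated assumptions: the step count and the function name `rotate_in_small_steps` are arbitrary, and finite steps reproduce the rotation only approximately.

```python
import cmath
import math


def rotate_in_small_steps(t_end, steps):
    """Follow dz = i z dt from z(0) = 1 with many small finite increments."""
    dt = t_end / steps
    z = 1 + 0j
    for _ in range(steps):
        z += 1j * z * dt          # each increment is perpendicular to z
    return z


t1 = 1.0
approx = rotate_in_small_steps(t1, 100_000)
exact = complex(math.cos(t1), math.sin(t1))      # right-hand side of (5)
library = cmath.exp(1j * t1)                     # left-hand side of (5)

print(abs(approx - exact), abs(library - exact))
```

With more steps the stepped vector approaches $\cos t_1 + i\sin t_1$; the small remaining discrepancy comes from the finite step size, since each finite step lengthens the vector very slightly.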

Using Euler's formula, we can write any complex number $z$
with modulus $r$ and argument $\varphi$ thus:

$$z = r(\cos\varphi + i\sin\varphi) = re^{i\varphi}$$

A complex number in this notation, $z = re^{i\varphi}$, is said to be in
exponential form. Here, the rule of adding arguments when
multiplying complex numbers becomes a simple consequence of the
addition of exponents in the multiplication of powers.

Indeed, let $z_1 = r_1 e^{i\varphi_1}$, $z_2 = r_2 e^{i\varphi_2}$; then

$$z = z_1 z_2 = r_1 e^{i\varphi_1} \cdot r_2 e^{i\varphi_2} = r_1 r_2 \cdot e^{i(\varphi_1 + \varphi_2)} = re^{i\varphi}$$

where

$$r = r_1 r_2, \qquad \varphi = \varphi_1 + \varphi_2$$

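From Euler's formula it is easy to obtain formulas for the sine
and cosine of a sum:

$$e^{i(\varphi + \psi)} = \cos(\varphi + \psi) + i\sin(\varphi + \psi)$$

On the other hand,

$$e^{i(\varphi + \psi)} = e^{i\varphi}e^{i\psi} = (\cos\varphi + i\sin\varphi)(\cos\psi + i\sin\psi)$$

$$= \cos\varphi\cos\psi - \sin\varphi\sin\psi + i(\sin\varphi\cos\psi + \cos\varphi\sin\psi)$$

Comparing the real and imaginary parts in these two expressions
for $e^{i(\varphi + \psi)}$, we get

$$\cos(\varphi + \psi) = \cos\varphi\cos\psi - \sin\varphi\sin\psi$$

$$\sin(\varphi + \psi) = \sin\varphi\cos\psi + \sin\psi\cos\varphi$$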

Of course, in this simple case we obtained the familiar formulas
that are readily proved in trigonometry without resorting to complex
numbers. But if, say, it is necessary to express $\cos 5\varphi$ and $\sin 5\varphi$
in terms of $\cos\varphi$ and $\sin\varphi$, then the use of Euler's formula is more
convenient, practically speaking, than ordinary trigonometric
transformations.

This is done as follows:

$$(e^{i\varphi})^5 = e^{i5\varphi} = \cos 5\varphi + i\sin 5\varphi$$

On the other hand,

$$e^{i\varphi} = \cos\varphi + i\sin\varphi$$

and so

$$(e^{i\varphi})^5 = (\cos\varphi + i\sin\varphi)^5$$

$$= \cos^5\varphi + i5\cos^4\varphi\sin\varphi - 10\cos^3\varphi\sin^2\varphi - i10\cos^2\varphi\sin^3\varphi + 5\cos\varphi\sin^4\varphi + i\sin^5\varphi$$

(The powering was done by the binomial theorem.)*
Comparing the real and imaginary parts, we get

$$\cos 5\varphi = \cos^5\varphi - 10\cos^3\varphi\sin^2\varphi + 5\cos\varphi\sin^4\varphi$$

$$\sin 5\varphi = \sin^5\varphi - 10\sin^3\varphi\cos^2\varphi + 5\sin\varphi\cos^4\varphi$$
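
A quick numerical spot check of the two multiple-angle formulas is sketched below. The angle 0.7 rad and the function name `cos5_sin5` are arbitrary choices for illustration; agreement at one angle is a sanity check, not a proof.

```python
import math


def cos5_sin5(phi):
    """Right-hand sides of the formulas for cos 5*phi and sin 5*phi."""
    c, s = math.cos(phi), math.sin(phi)
    cos5 = c**5 - 10 * c**3 * s**2 + 5 * c * s**4
    sin5 = s**5 - 10 * s**3 * c**2 + 5 * s * c**4
    return cos5, sin5


phi = 0.7                                   # arbitrary test angle in radians
cos5, sin5 = cos5_sin5(phi)

# The same numbers obtained by raising cos(phi) + i sin(phi) to the fifth power
fifth_power = complex(math.cos(phi), math.sin(phi)) ** 5

print(cos5 - math.cos(5 * phi), sin5 - math.sin(5 * phi))
```

Both printed differences should vanish up to rounding, and the real and imaginary parts of `fifth_power` should reproduce `cos5` and `sin5`.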


* Replace the exponent 5 by any integer $n$ and we get the identity
$(\cos\varphi + i\sin\varphi)^n = \cos n\varphi + i\sin n\varphi$, which is known as the
de Moivre formula.

Here is another example of the use of Euler's formula. Suppose
it is required to find $\int e^{ax}\cos bx\,dx$. We consider the equation

$$\int e^{(a+ib)x}\,dx = \frac{e^{(a+ib)x}}{a + ib} + C \tag{6}$$

where $C = C_1 + iC_2$. Note that by Euler's formula

$$\int e^{(a+ib)x}\,dx = \int e^{ax}\cos bx\,dx + i\int e^{ax}\sin bx\,dx$$

By means of the following manipulations we can separate the real
part of the right-hand member of (6) from the imaginary part:

$$e^{(a+ib)x} = e^{ax}e^{ibx} = e^{ax}(\cos bx + i\sin bx)$$

$$\frac{1}{a + ib} = \frac{a - ib}{(a + ib)(a - ib)} = \frac{a}{a^2 + b^2} - i\,\frac{b}{a^2 + b^2}$$

Multiplying, we find that the quotient in the right-hand member
of (6) is

$$\frac{e^{ax}}{a^2 + b^2}(a\cos bx + b\sin bx) + i\,\frac{e^{ax}}{a^2 + b^2}(a\sin bx - b\cos bx)$$

Then equation (6) becomes

$$\int e^{ax}\cos bx\,dx + i\int e^{ax}\sin bx\,dx = \frac{e^{ax}}{a^2 + b^2}(a\cos bx + b\sin bx) + i\,\frac{e^{ax}}{a^2 + b^2}(a\sin bx - b\cos bx) + C$$

Comparing the real and imaginary parts, we obtain

$$\int e^{ax}\cos bx\,dx = \frac{e^{ax}(a\cos bx + b\sin bx)}{a^2 + b^2} + C_1$$

$$\int e^{ax}\sin bx\,dx = \frac{e^{ax}(a\sin bx - b\cos bx)}{a^2 + b^2} + C_2$$

Exercises 


1. Write down the following numbers in exponential form:

(a) $1 + i$, (b) $1 - i$, (c) $-1$, (d) $3i$.

2.  Using  Euler’s  formula,  find 


(a)  (I  (b)(y  + ^rf 


3. Express $\cos 3\varphi$, $\sin 4\varphi$ in terms of $\sin\varphi$ and $\cos\varphi$.

4. Prove the formulas

$$\cos\varphi = \frac{e^{i\varphi} + e^{-i\varphi}}{2}, \qquad \sin\varphi = \frac{e^{i\varphi} - e^{-i\varphi}}{2i}$$

5.4 Logarithms and roots

It is a remarkable fact that an exponential function with
imaginary exponents becomes periodic. From Euler's formula it follows
that $e^{2\pi i} = \cos 2\pi + i\sin 2\pi = 1$ and so $e^{i(t + 2\pi)} = e^{it}e^{2\pi i} = e^{it}$,
that is to say, the function $z(t) = e^{it}$ is periodic with period $2\pi$.

The periodic properties of a power are evident in the following
simple instances:

1. $(-1)^0 = 1$, $(-1)^1 = -1$, $(-1)^2 = 1$, $(-1)^3 = -1$, ...,
$(-1)^{2n+g} = (-1)^g$.

2. $i^0 = 1$, $i^1 = i$, $i^2 = -1$, $i^3 = -i$, $i^4 = 1$, $i^5 = i$,
$i^6 = -1$, ..., $i^{4n+s} = i^s$.

From these examples we conclude that $(-1)^g$ as a function of $g$
is of period 2, and $i^s$ as a function of $s$ is of period 4.

We will now demonstrate that the periodicity of these functions
is a consequence of the periodicity of the function $e^{it}$. Using
Euler's formula, write $-1$ in exponential form: $-1 = e^{\pi i}$. Therefore
$(-1)^g = e^{i\pi g}$, where $g$ is any number. Now let us determine the period
of the function $e^{i\pi g}$ for real $g$. If $G$ is the period, then
$e^{i\pi(g + G)} = e^{i\pi g}$. Then it must be true that $e^{i\pi G} = 1$, whence
$\pi G = 2\pi$, $G = 2$. Similarly, $i = e^{i\pi/2}$ and so $i^s = e^{i\pi s/2}$.
The period of the function $e^{i\pi s/2}$ is 4.

The periodicity of the exponential function has exciting
consequences for logarithms. The logarithm of a complex number $z$ is a
complex number $w$ such that $e^w = z$.

Let $z = re^{i\varphi}$. Write $r$ as $r = e^{\rho}$, where $\rho$ is the natural
logarithm of the positive number $r$. Then $z = e^{\rho + i\varphi}$ and so

$$w = \ln z = \rho + i\varphi = \ln r + i\varphi$$

But we already know that $e^{2\pi ki} = 1$ if $k$ is an integer, and so we
can write $z = e^{\rho + i\varphi + 2\pi ki}$, whence $\ln z = \ln r + i(\varphi + 2k\pi)$.
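
The formula $\ln z = \ln r + i(\varphi + 2k\pi)$ can be tried out directly. In the sketch below the number $z = -3 + 4i$, the range of $k$ and the function name `log_values` are illustrative assumptions; Python's `cmath.log` returns only the value with $k = 0$ (argument taken in $(-\pi, \pi]$).

```python
import cmath
import math


def log_values(z, ks):
    """ln r + i(phi + 2 k pi) for each integer k in ks."""
    r, phi = cmath.polar(z)                  # modulus and argument of z
    return [complex(math.log(r), phi + 2 * math.pi * k) for k in ks]


z = -3 + 4j                                  # illustrative number
values = log_values(z, range(-2, 3))         # k = -2, -1, 0, 1, 2

# Every listed value w should satisfy e^w = z
errors = [abs(cmath.exp(w) - z) for w in values]
principal = cmath.log(z)                     # the library picks k = 0

print(values)
print(max(errors))
```

All five numbers exponentiate back to the same $z$, which is the many-valuedness of the logarithm in concrete form.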

Thus, the logarithm of a complex number has infinitely many
values. This situation is reminiscent of that existing between
trigonometric and inverse trigonometric functions. For example, since
$\tan\varphi$ has a period equal to $\pi$, that is, $\tan(\varphi + k\pi) = \tan\varphi$ if $k$ is
integral, then $\arctan x$ has an infinity of values. Indeed, if
$\varphi = \arctan x$ is one of the values of the arctangent, then $\varphi + k\pi$ is
also a value of the arctangent ($k$ an integer).

The periodicity of the exponential function is also important
in root extraction, that is, when raising a number to a fractional
power.

Suppose we have $z = re^{i\varphi}$; as we have seen, this can be written
as $z = re^{i\varphi + 2\pi ki}$. Then

$$z^n = r^n e^{(i\varphi + 2\pi ki)n} = r^n e^{in\varphi} e^{2\pi kni} = ce^{2\pi kni} \tag{7}$$

where $c$ stands for $r^n e^{in\varphi}$. If $n$ is an integer, then $e^{2\pi kni} = 1$, and
for this reason we get only one value of $z^n$, namely the number $c$.
Thus, raising to an integral power is a unique operation. The
situation changes if $n$ is fractional, $n = \frac{p}{q}$, where $p$ and $q > 0$ are
prime to each other (that is, they do not have any common divisors
other than unity). Then $e^{2\pi kni}$ can be different from 1 and by (7) we
get new values of $z^n$ that differ from $c$. It can be proved that the
total number of distinct values of $z^n$ is equal to $q$.
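
The count of distinct values can be observed numerically from (7). The sketch below is an illustration under assumptions: the sample number $1 + i$, the exponent $2/3$, the tolerance and the helper names are all choices made for the example, and $p$, $q$ are taken to be coprime as in the text.

```python
import cmath
import math


def power_values(z, p, q, k_range):
    """Values c * e^{2 pi k n i} of z^n from (7), with n = p/q, for each k."""
    n = p / q
    r, phi = cmath.polar(z)
    c = r**n * cmath.exp(1j * n * phi)
    return [c * cmath.exp(2j * math.pi * k * n) for k in k_range]


def count_distinct(values, tol=1e-9):
    """Number of values that differ from one another by more than tol."""
    distinct = []
    for v in values:
        if all(abs(v - d) > tol for d in distinct):
            distinct.append(v)
    return len(distinct)


fractional = count_distinct(power_values(1 + 1j, 2, 3, range(-6, 7)))   # n = 2/3
integral = count_distinct(power_values(1 + 1j, 3, 1, range(-3, 4)))     # n = 3

print(fractional, integral)
```

Running $k$ over many integers produces only $q = 3$ different numbers for $n = 2/3$ and a single number for the integer exponent.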

Consider an example. Find all the values of $1^{1/3} = \sqrt[3]{1}$.
Since $1 = e^{2\pi ki}$, it follows that $1^{1/3} = e^{2\pi ki/3}$. Putting $k = 0$
in the last equation, we get $1^{1/3} = 1$; putting $k = 1$, we get

$$1^{1/3} = e^{2\pi i/3} = \cos\frac{2\pi}{3} + i\sin\frac{2\pi}{3} = -\frac{1}{2} + i\,\frac{\sqrt{3}}{2}$$

Putting $k = 2$, we obtain

$$1^{1/3} = e^{4\pi i/3} = \cos\frac{4\pi}{3} + i\sin\frac{4\pi}{3} = -\frac{1}{2} - i\,\frac{\sqrt{3}}{2}$$

Thus, we have three values for $\sqrt[3]{1}$. (Verify by directly raising
to the third power that $\left(-\frac{1}{2} \pm i\,\frac{\sqrt{3}}{2}\right)^3 = 1$.)
It is easy to see that by putting $k = 3, 4, \ldots$ or $k = -1, -2, \ldots$, we
do not get new values of the root.

If $n$ is an integer, then the roots of the equation $x^n = 1$ are $n$
numbers of the form

$$x_k = \cos\frac{2k\pi}{n} + i\sin\frac{2k\pi}{n}, \quad \text{where } k = 0, 1, 2, \ldots, n - 1$$

Here, substituting $k = 0$, we get $x_0 = \cos 0 + i\sin 0 = 1$, which
is a definitely known value. The numbers $x_k$ are depicted in the
plane by points located at the vertices of a regular $n$-gon inscribed
in a circle of radius 1. Indeed, the modulus of any one of the
numbers $x_k$ is equal to $r_k = \sqrt{\cos^2\frac{2k\pi}{n} + \sin^2\frac{2k\pi}{n}} = 1$,
the argument of the number $x_0$ is zero, and the argument increases by
$\frac{2\pi}{n}$ as $k$ is increased by unity. Fig. 50 shows the roots of the
equation $x^5 = 1$.
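
The roots of $x^n = 1$ are simple to tabulate. The following sketch does so for $n = 5$; the function name `roots_of_unity` is an assumption made for the example.

```python
import cmath
import math


def roots_of_unity(n):
    """x_k = cos(2 k pi / n) + i sin(2 k pi / n) for k = 0, 1, ..., n - 1."""
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]


roots = roots_of_unity(5)

power_errors = [abs(x**5 - 1) for x in roots]        # each root satisfies x^5 = 1
modulus_errors = [abs(abs(x) - 1) for x in roots]    # each root lies on the unit circle

for k, x in enumerate(roots):
    print(k, x, math.degrees(cmath.phase(x)))
```

The printed arguments step by 72° (that is, $2\pi/5$), with values above 180° reported as negative angles by `cmath.phase`; these are the vertices of a regular pentagon.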

For the simplest type of equation of degree $n$ ($n$ a positive
integer), $x^n = 1$, we saw that the total number of real and imaginary
roots is equal to $n$. Actually (but we do not give the proof here)
every algebraic equation of degree $n$ has $n$ complex roots, some of which
may be real, for it is always assumed that real numbers constitute
a particular case of complex numbers. The number of distinct roots
may be less than $n$, since some roots may be coincident (so-called
multiple roots). For example, the quartic equation

$$(x - 1)^3 (x + 1) = 0$$

has a triple root $x_{1,2,3} = 1$ and a simple root $x_4 = -1$, making
a total of four roots.

We  conclude  this  section  with  a survey  of  the  process  of  increas- 
ing complication  of  mathematical  operations  and  the  concomitant 
development  of  the  number  notion.  The  starting  point  was  the  class 
(set)  of  positive  integers  (also  called  natural  numbers).  Using  these 
numbers  alone,  we  can  always  carry  out  addition,  inasmuch  as  the 
result  of  adding  two  natural  numbers  is  again  a natural  number; 
subtraction  however  is  not  always  possible.  To  make  subtraction 
possible  we  have  to  consider  negative  integers  and  zero. 

Let  us  consider  the  next  two  operations : multiplication  and  divi- 
sion. The  result  of  multiplying  two  integers  is  an  integer,  but  the 
operation  of  dividing  integers  is  not  always  possible  in  whole  numbers 
(integers).  For  division  to  be  possible  in  all  cases,  we  have  to  introduce 
fractions.  Together,  integers  and  fractions  constitute  the  class  of 
rational numbers. But we know that the limit of a sequence of
rational numbers may be a nonrational, that is, an irrational number
(these numbers are further subdivided into "algebraic" and
"transcendental" numbers, but we will not go any further into this matter
here). The rational and irrational numbers together form the class
of  real  numbers  in  which  the  operation  of  passing  to  a limit  is  always 
possible.  However,  in  this  last  class  it  is  not  always  possible  to 
carry  out  algebraic  operations:  for  example,  it  is  possible  to  take  the 
roots  of  any  power  of  positive  numbers  but  it  is  not  possible  to  take 
the  roots  of  even  powers  (square  roots,  say)  of  negative  numbers. 

Taking  the  square  root  of  a negative  number  becomes  possible 
only  when  we  introduce  complex  numbers.  It  is  a remarkable  fact 
that  the  introduction  of  complex  numbers  is  the  final  extension  of 
the  number  system.  Indeed,  if  we  have  complex  numbers  at  our 
disposal,  then  we  can  take  the  roots  (of  any  degree)  not  only  of  ne- 
gative but  also  of  any  complex  numbers.  The  results  are  always 
complex  numbers.  But  in  mathematics  there  are  operations  that 
cannot  be  performed  if  we  remain  within  the  class  of  real  numbers 
only.  For  instance,  raising  a positive  number  to  any  real  power 
always  results  in  a positive  number,  so  that  there  are  no  logarithms 
of  negative  numbers.  Another  impossible  operation:  there  is  no 
real number $\varphi$ such that, say, $\cos\varphi = 2$. The question then arises:
perhaps it is necessary, in order to find the logarithms of negative
numbers, to introduce a new "imaginary unit" that differs from
the $i$ introduced earlier in order to be able to take the roots of
even powers of negative numbers? Will the solution of the equation
$\cos\varphi = 2$ require some other third "imaginary unit"? It turns out
that this is not the case: with the introduction of complex numbers
we are now able to take the logarithms of negative and even complex
numbers (see above) and solve equations of the form $\cos\varphi = k$,
where $k$ is any number (see Exercise 4 below), and so on. Thus, any
operations  involving  complex  numbers  are  possible  and  the  results 
of  such  operations  are  again  complex  numbers.  Therefore  there  is 
no  need  to  introduce  any  new  numbers. 

Exercises 

1.  Find  the  logarithms  of  the  following  numbers: 

(a) $-1$, (b) $i$, (c) $-i$, (d) $1 + i$.

2.  Find  all  the  values  of  f — 1. 

3.  Find  all  the  values  of  f 1. 

In  Problems  2 and  3 obtain  values  of  the  roots  in  trigonometric 
and  algebraic  forms.  Verify  the  values  of  the  roots  by  raising 
to  the  appropriate  power  using  the  binomial  theorem. 

4. Solve the equation $\cos\varphi = 2$. Indicate all solutions.

5. State all solutions of the equation $\sin\varphi = 2$.

Hint. In Problems 4 and 5 use formulas that express the sine
and cosine in terms of the exponential function. (See Exercise 4
of Sec. 5.3.)

5.5 Describing harmonic oscillations by the exponential
function of an imaginary argument

Knowing  that  the  exponential  function  is  periodic  for  imaginary 
exponents,  we  can  use  it  to  solve  problems  involving  mechanical 
vibrations  and  the  theory  of  electric  circuits. 

Let us consider, for example, harmonic oscillations (which are
oscillations that obey the sinusoidal law) of electric current in a
circuit in which the current obeys the law

$$j = j_0\sin(\omega t + \alpha) \tag{8}$$

Here, $j_0$ is the amplitude of the oscillations (maximum current), $\omega$
is the frequency, and $\alpha$ is the initial phase (the value of the phase,
$\omega t + \alpha$, at $t = 0$). It proves convenient to introduce, in
addition to (8), the notion of a complex current

$$J = j_0 e^{i(\omega t + \alpha)} \tag{9}$$

for which the "real" current (8) serves as the imaginary part, since
by Euler's formula (5),

$$j_0 e^{i(\omega t + \alpha)} = j_0\cos(\omega t + \alpha) + ij_0\sin(\omega t + \alpha)$$

The complex current (9) is represented in the complex plane by a
vector (Fig. 51) of length $j_0$ that forms with the positive real axis
an angle $\omega t + \alpha$; thus, at $t = 0$ this vector is inclined at the angle
$\alpha$, and as $t$ increases it rotates uniformly with angular velocity $\omega$.
The current (8) is obtained by projecting the terminus of the vector $J$
on the imaginary axis. (If instead of (8) the law of current were given
by $j = j_0\cos(\omega t + \alpha)$, then it would be the real part of the complex
current (9) and would be obtained by projecting $J$ on the real axis.)

Expression  (9)  is  an  example  of  a complex-valued  function  of 
a real  independent  variable.  In  the  general  case,  any  such  function 
may  be  written  as 

$$f(t) = g(t) + ih(t)$$

where  g(t)  and  h(t)  are  ordinary  real  functions  of  a real  independent 
variable.  Such  complex-valued  functions  have  the  following  obvious 
properties. 

If  the  complex  functions  are  added,  so  also  are  their  real  and 
imaginary  parts. 

If  a complex  function  is  multiplied  by  a real  constant  or  a real 
function,  the  real  and  imaginary  parts  are  multiplied  by  that  factor. 

If  a complex  function  is  differentiated  or  integrated,  the  same 
operations  are  performed  on  its  real  and  imaginary  parts. 

These properties make it possible to perform the indicated
operations on the whole complex function (instead of on the real
or imaginary part) and then take the real or, correspondingly, the
imaginary part of the result. It is a remarkable thing that such a
transition to complex quantities together with the reverse transition
to the required real quantities may turn out to be simpler and more
pictorial than straightforward operations performed on real
quantities.

Fig. 51

Let us consider, for example, the superposition of oscillations
having the same frequency. Suppose we have to add currents:

$$j = j_1\sin(\omega t + \alpha_1) + j_2\sin(\omega t + \alpha_2)$$

We have seen that the appropriate complex currents $J_1$ and $J_2$ are
represented in the plane of the complex variable by vectors
uniformly rotating with angular velocity $\omega$. And in Sec. 5.1 we
demonstrated that such vectors are added by the parallelogram rule.
This means that the total complex current $J$ will uniformly rotate
with angular velocity $\omega$, that is, it may be written as (9), where $j_0$
and $\alpha$ are readily obtained geometrically. In Fig. 52 we have the
position of a rotating parallelogram at time $t = 0$, which is the
time we need for determining $j_0$ and $\alpha$. These parameters may be
obtained via a geometric construction, as in Fig. 52, but they are
also obtainable analytically. To find $j_0$ we can apply the cosine
theorem to the triangle $OAB$ to get

$$j_0^2 = OB^2 = AO^2 + AB^2 - 2\,AO \cdot AB\cos(\widehat{AO, AB})$$

$$= j_1^2 + j_2^2 - 2j_1 j_2\cos[180^\circ - (\alpha_2 - \alpha_1)]$$

$$= j_1^2 + j_2^2 + 2j_1 j_2\cos(\alpha_2 - \alpha_1)$$

To find $\alpha$, note that the projections of the vector $J|_{t=0}$ on the
coordinate axes are respectively equal to

$$j_1\cos\alpha_1 + j_2\cos\alpha_2 \quad \text{and} \quad j_1\sin\alpha_1 + j_2\sin\alpha_2$$

whence

$$\tan\alpha = \frac{j_1\sin\alpha_1 + j_2\sin\alpha_2}{j_1\cos\alpha_1 + j_2\cos\alpha_2}$$

Fig. 52

Similar  results  are  obtained  in  superposing  any  number  of 
oscillations  that  have  the  same  frequency.  It  is  clear  at  the  same 
time  that  superposition  of  oscillations  occurring  with  different  fre- 
quencies produces  a total  current  of  a complicated  nature  that  will 
not  vary  by  the  harmonic  law. 

Still more pictorial is the differentiation of a complex current.
From (9) we get

$$\frac{dJ}{dt} = j_0 e^{i(\omega t + \alpha)}\,i\omega = i\omega J \tag{10}$$

By virtue of Sec. 5.1, this multiplication reduces to an $\omega$-fold
stretching of the vector $J$ and counterclockwise rotation of it
through 90°. Which means the vector $\frac{dJ}{dt}$ is also in uniform
rotation with velocity $\omega$. Similarly, dropping the arbitrary constant,
we get

$$\int J\,dt = \frac{j_0 e^{i(\omega t + \alpha)}}{i\omega} = \frac{J}{i\omega} = -i\,\frac{J}{\omega} \tag{11}$$

This vector is obtained from $J$ by an $\omega$-fold compression and a
90° rotation in the negative direction. The position of these vectors
at $t = 0$ is shown in Fig. 53 on what is known as a phase diagram.

Fig. 53


These results can be applied to working out oscillations in an
electric circuit containing any combinations of resistors, inductors,
and capacitors. For this purpose we take advantage of Fig. 54,
which shows the relationship between the voltage drop on a circuit
element and the current flow in that element. (The appropriate
formulas are derived in electricity theory courses; see also HM,
Sec. 8.1.) Also, use is made of the Kirchhoff laws, which state that
the algebraic sum of all currents flowing toward any point in a
network is equal to zero; and the algebraic sum of the voltage drops
on any sequence of elements forming a closed circuit is equal to
zero. These laws are needed for complicated, extended networks,
since for simple circuits they do not offer anything beyond
the obvious initial relationships.

Consider, for example, the RL circuit shown in Fig. 55. It has
a voltage source varying by the harmonic law

$$\varphi = \varphi_0\sin(\omega t + \beta) \tag{12}$$

This gives rise to a current that varies by the harmonic law (8),
though $j_0$ and $\alpha$ are not known beforehand. Equating $\varphi$ to the sum
of the voltage drops on $R$ and $L$ on the basis of the formulas of
Fig. 54, we get (here, in implicit form, we make use of both the
Kirchhoff laws)

$$Rj + L\frac{dj}{dt} = \varphi$$

Fig. 54. Voltage drop on circuit elements: resistance $R$, $\varphi = Rj$;
inductance $L$, $\varphi = L\frac{dj}{dt}$; capacitance $C$, $\varphi = \frac{1}{C}\int j\,dt$

Passing to the complex current (9) and the complex voltage
$\Phi = \varphi_0 e^{i(\omega t + \beta)}$, we get

$$RJ + L\frac{dJ}{dt} = \Phi \tag{13}$$

or, using (10),

$$RJ + i\omega LJ = \Phi \tag{14}$$

whence

$$J = \frac{\Phi}{R + i\omega L} = \frac{\varphi_0 e^{i\beta}}{R + i\omega L}\,e^{i\omega t}$$

We see that the inductance $L$ may be interpreted as a resistance
numerically equal to $i\omega L$; it is called the impedance of element $L$.
Now, to obtain the desired current, it remains to take the imaginary
part of the last expression. It is simpler, however, to write the
coefficient $\frac{\varphi_0 e^{i\beta}}{R + i\omega L}$ in the exponential form $j_0 e^{i\alpha}$,
which produces the desired values $j_0$ and $\alpha$ at once. From this we see that

$$j_0 = \left|\frac{\varphi_0 e^{i\beta}}{R + i\omega L}\right| = \frac{\varphi_0}{\sqrt{R^2 + \omega^2 L^2}}$$

$$\alpha = \arg\frac{\varphi_0 e^{i\beta}}{R + i\omega L} = \beta + \arg\frac{1}{R + i\omega L} = \beta + \arg\frac{R - i\omega L}{R^2 + \omega^2 L^2} = \beta + \arg(R - i\omega L)$$

(here we make use of the fact that the argument arg of a complex
number remains unchanged when this number is multiplied by a
positive real number). Hence the phase of the current in an RL circuit
is delayed relative to the phase of the voltage source, which is of
course due to the presence of an inductance.
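
The impedance calculation for the RL circuit can be sketched in code as follows. The component values and the function names are illustrative assumptions, not data from the text; the residual function substitutes the resulting real current back into $Rj + L\,dj/dt = \varphi$ as a check.

```python
import cmath
import math


def rl_current(R, L, omega, phi0, beta):
    """Amplitude j0 and phase alpha of the current in a series RL circuit."""
    coeff = phi0 * cmath.exp(1j * beta) / complex(R, omega * L)   # j0 * e^{i alpha}
    return abs(coeff), cmath.phase(coeff)


# Illustrative values: ohms, henries, rad/s, volts, radians
R, L, omega, phi0, beta = 10.0, 0.05, 100.0, 5.0, 0.4
j0, alpha = rl_current(R, L, omega, phi0, beta)


def residual(t):
    """R j + L dj/dt - phi for the real current j = j0 sin(omega t + alpha)."""
    j = j0 * math.sin(omega * t + alpha)
    dj_dt = j0 * omega * math.cos(omega * t + alpha)
    return R * j + L * dj_dt - phi0 * math.sin(omega * t + beta)


worst = max(abs(residual(t)) for t in (0.0, 0.003, 0.02, 0.5))
print(j0, alpha, beta - alpha, worst)
```

For these numbers the phase lag `beta - alpha` equals $\arctan(\omega L / R)$, which is positive, so the current is indeed delayed relative to the voltage.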

The desired current can also be obtained with the aid of the
geometrical construction shown in Fig. 56. Suppose for a moment
that we have the current (8) and seek the voltage (12). Then it is
easy to construct the vector $J|_{t=0} = j_0 e^{i\alpha}$ and, together with it, the
mutually perpendicular vectors $RJ|_{t=0}$ and $i\omega LJ|_{t=0}$. But by virtue
of (14) the vector $\Phi|_{t=0} = \varphi_0 e^{i\beta}$ is a diagonal of a rectangle
constructed on the last two vectors, which means that this vector is
readily constructed.

Now let us return to the original problem of finding the current
from the voltage. Depict the vector $\Phi|_{t=0} = \varphi_0 e^{i\beta}$ of the external
complex voltage at time $t = 0$ (its modulus $\varphi_0$ and the argument $\beta$ are
given). This vector must serve as a diagonal of the rectangle
constructed on the vectors $RJ|_{t=0}$ and $i\omega LJ|_{t=0}$, but the vector
$J|_{t=0}$ is not given, it is sought. But this rectangle is similar to the
rectangle constructed on the vectors $R$ and $i\omega L$ (the smaller rectangle
in Fig. 56), which means we have to proceed as follows: construct the
latter rectangle, then increase or decrease it (observing similarity
relations) so that its diagonal is equal to $\varphi_0$, then rotate the
diagonal to coincidence with the vector $\Phi|_{t=0}$; it will then have the
position of the large rectangle in Fig. 56. Finally, divide the side
$RJ|_{t=0}$ by $R$ to get $J|_{t=0}$.

Now let us consider the LC circuit shown in Fig. 57. Here, instead of (13), we get

$$ L\frac{dJ}{dt} + \frac{1}{C}\int J\,dt = \Phi $$

whence, using (10) and (11), we get

$$ i\omega LJ - i\frac{J}{\omega C} = \Phi $$

that is,

$$ J = \frac{\Phi}{i\left(\omega L - \dfrac{1}{\omega C}\right)} = \frac{-i\omega\,\varphi_0 e^{i\beta}}{L\left(\omega^2 - \dfrac{1}{LC}\right)}\,e^{i\omega t} \tag{15} $$

Thus, the impedance of the capacitor is the quantity $-\dfrac{i}{\omega C}$.

Different cases are possible here. If the frequency $\omega$ of the external voltage source is sufficiently great, to be more exact, if $\omega^2 > \dfrac{1}{LC}$, then the bracketed quantity in the denominator is positive. Representing $-i = e^{-i\pi/2}$, we can write
and the solution (15) is not suitable, since there is a zero in the denominator on the right. In this case, for all $j_0$ and $\alpha$ the expression (9) satisfies the relation

$$ L\frac{dJ}{dt} + \frac{1}{C}\int J\,dt = i\omega LJ - i\frac{J}{\omega C} = \frac{iL}{\omega}\left(\omega^2 - \frac{1}{LC}\right)J = 0 $$

which is to say that in a circuit without an external voltage source (when the voltage source is short-circuited), undamped harmonic oscillations via law (8) are possible. Such oscillations that occur in the absence of any external source are called free, or natural, oscillations. Thus, if the frequency of the external voltage satisfies the relation (16), then it is equal to the frequency of the natural undamped oscillations in the circuit. Under these conditions, as we know from physics, resonance sets in and instead of the periodic harmonic oscillations that we are considering here, there appear oscillations with increasing amplitude. Resonance will be examined in Ch. 7.
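
The vanishing of the total impedance $i\omega L - i/(\omega C)$ when $\omega^2 = 1/(LC)$ can be seen numerically in the small sketch below. The helper `lc_impedance` and the values of L and C are assumptions made only for illustration.

```python
import math

def lc_impedance(L, C, omega):
    """Total impedance i*omega*L - i/(omega*C) of inductance and capacitance in series."""
    return 1j * omega * L - 1j / (omega * C)

L, C = 0.5, 2e-6                   # assumed illustrative values
omega0 = 1 / math.sqrt(L * C)      # frequency with omega0**2 = 1/(L*C)

for omega in (0.5 * omega0, omega0, 2 * omega0):
    print(omega, lc_impedance(L, C, omega))
```

Below this frequency the imaginary part is negative (the capacitor dominates), above it the imaginary part is positive (the inductance dominates), and at the frequency itself the impedance is zero up to rounding, which is why formula (15) cannot be used there.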

Bear  in  mind  that  this  method  enables  one  to  obtain  the 
current  that  is  set  up  in  a circuit  after  a certain  “transitory  period”. 
The  transient  process  is  also  described  by  the  methods  given  in  Ch.  7. 

Exercises 

1. Consider a series-connected circuit with resistance R, inductance L and capacitance C. What occurs in the particular case where $\omega^2 = \dfrac{1}{LC}$?

2.  Consider  a parallel-connected  circuit  with  resistance  R and 
inductance  L. 

5.6  The  derivative  of  a function  of  a complex  variable 

The variable $w = f(z)$ is called a function of the complex number $z$ if to each value of $z$ there is associated a definite value of $f(z)$. Since $z = x + iy$, where $x$ is the real part and $y$ is the imaginary part, it follows that specification of $z$ signifies specification of two real numbers $x$ and $y$. Here, $f(z) = u(x, y) + iv(x, y)$, where $u(x, y)$ and $v(x, y)$ are real functions. To each $z$ there correspond definite $x$ and $y$ and, hence, definite $u$ and $v$ and, consequently, a definite value of $f(z)$. However, we will now consider as a function $f(z)$ not every expression $u(x, y) + iv(x, y)$ but only such a quantity as depends on $z$ via such formulas as, say, $f(z) = 1 + z^2$, $f(z) = z$, $f(z) = e^z$, $f(z) = \sin z$, and so on. These formulas can involve algebraic operations or nonalgebraic operations on $z$, but they must be such as can be expressed with the aid of a Taylor series in $z$, for example, $e^z$, $\sin z$, and so forth.

We thus consider formulas involving $z$ but not its real or imaginary part separately. (From this viewpoint, $z^* = x - iy$ or $|z|$ are not regarded as functions of $z$, although if we know $z$, it is easy to find $z^*$ and $|z|$.) With this definition of $f(z)$, all functions $f(z)$ have one property in common: we can find the derivative

$$ f'(z) = \frac{df}{dz} $$

by the ordinary rules for differentiating functions. A function $f(z)$ that has a derivative is called an analytic function. We are thus going to consider only analytic functions.

For functions given by simple formulas the computation of derivatives is no more complicated than for functions of a real variable. Consider, say, the function $w = z^2$. Giving $z$ the increment $\Delta z$, we get the increment of the function:

$$ \Delta w = (z + \Delta z)^2 - z^2 = 2z\,\Delta z + (\Delta z)^2 $$

whence, passing to differentials and dropping (the moduli of) higher infinitesimals, we get

$$ dw = 2z\,dz, \quad \frac{dw}{dz} = 2z, \quad \text{that is,} \quad \frac{d(z^2)}{dz} = 2z $$

It is thus clear that in these computations it is immaterial whether the independent variable assumes complex or only real values.

However, in the general case of a complex variable, the existence of a derivative is not at all so obvious and simple as in the case of a real variable. Indeed, $z = x + iy$ and we can give the increment $dz$ by changing $x$ and $y$ at the same time: $dz = dx + i\,dy$; we can give the increment $dz$ by changing only $x$: $dz = dx$; finally, we can change only $y$: $dz = i\,dy$. If the derivative $f'(z)$ exists, then this means that for distinct modes of changing $z$ the corresponding changes in $f$ are such that the ratios $\dfrac{df}{dz}$ are the same in all cases.

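Write $f(z)$ thus: $f(z) = u(x, y) + iv(x, y)$. Suppose only $x$ changes; then $dz = dx$ and

$$ \frac{df}{dz} = \frac{\dfrac{\partial f}{\partial x}\,dx}{dx} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} $$

If it is only $y$ that is changed, then $dz = i\,dy$, and so

$$ \frac{df}{dz} = \frac{\dfrac{\partial f}{\partial y}\,dy}{i\,dy} = \frac{\left(\dfrac{\partial u}{\partial y} + i\dfrac{\partial v}{\partial y}\right)dy}{i\,dy} = \frac{1}{i}\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y} = -i\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y} $$

Equating the two expressions for the derivative obtained by different methods, we get

$$ \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} = -i\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y} $$

whence, comparing the real and imaginary parts,

$$ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \quad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y} \tag{17} $$

These formulas are called the Cauchy-Riemann conditions. Thus, the real and imaginary parts of an analytic function $f(z)$ are connected by definite relations.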

Consider the following example. Suppose

$$ f(z) = z^2 = (x + iy)^2 = x^2 - y^2 + i \cdot 2xy $$

Here, $u = x^2 - y^2$, $v = 2xy$,

$$ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} = 2x, \quad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y} = 2y $$

and the Cauchy-Riemann conditions (17) hold.

Consider an example of an opposite nature. Let $f(z) = z^* = x - iy$, where $z = x + iy$. We noted above that although such a function $f(z)$ may be, in a certain sense, called a function of $z$ (to every $z$ there corresponds a definite $z^*$), it is not an analytic function, since $u = x$, $v = -y$ and so

$$ \frac{\partial u}{\partial x} = 1, \quad \frac{\partial v}{\partial x} = 0, \quad \frac{\partial u}{\partial y} = 0, \quad \frac{\partial v}{\partial y} = -1 $$

and the first of the Cauchy-Riemann conditions does not hold true.
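
The two examples can also be checked numerically. The sketch below estimates the four partial derivatives by central differences and tests the Cauchy-Riemann conditions at one point; the helper names `partials` and `satisfies_cr`, the step size and the test point are assumptions made for illustration only.

```python
def partials(f, x, y, h=1e-6):
    """Central-difference estimates of u_x, u_y, v_x, v_y for f = u + i*v."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    return fx.real, fy.real, fx.imag, fy.imag

def satisfies_cr(f, x, y, tol=1e-6):
    """Check u_x = v_y and v_x = -u_y at the point (x, y) within a tolerance."""
    ux, uy, vx, vy = partials(f, x, y)
    return abs(ux - vy) < tol and abs(vx + uy) < tol

print(satisfies_cr(lambda z: z * z, 1.5, -0.7))          # f(z) = z^2
print(satisfies_cr(lambda z: z.conjugate(), 1.5, -0.7))  # f(z) = z*
```

A numerical check at a single point is of course no proof of analyticity; it only mirrors the hand computation above.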
Exercise 

Verify the Cauchy-Riemann conditions for the function $f(z) = z^3$.

5.7  Harmonic  functions 

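Let us return to analytic functions. We take the derivatives with respect to $x$ of the first equation in (17) and with respect to $y$ of the second to get

$$ \frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 v}{\partial x\,\partial y}, \quad \frac{\partial^2 v}{\partial x\,\partial y} = -\frac{\partial^2 u}{\partial y^2} $$

whence

$$ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \tag{18} $$

Similarly, differentiating the first equation in (17) with respect to $y$ and the second with respect to $x$, we get $\dfrac{\partial^2 v}{\partial x^2} + \dfrac{\partial^2 v}{\partial y^2} = 0$. Equation (18) is called the Laplace equation. By specifying different $f(z)$, we can obtain distinct solutions of this equation, which are called harmonic functions.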
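
A quick finite-difference check of the Laplace equation is sketched below for the real and imaginary parts of $z^3$, with $x^2 + y^2$ as a counterexample. The function `laplacian`, the choice of $f(z) = z^3$ and the sample point are assumptions for illustration.

```python
def laplacian(g, x, y, h=1e-3):
    """Five-point finite-difference estimate of g_xx + g_yy at (x, y)."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4 * g(x, y)) / h**2

f = lambda z: z**3
u = lambda x, y: f(complex(x, y)).real   # real part of z^3
v = lambda x, y: f(complex(x, y)).imag   # imaginary part of z^3
w = lambda x, y: x * x + y * y           # |z|^2, not harmonic

print(laplacian(u, 0.8, -1.1), laplacian(v, 0.8, -1.1), laplacian(w, 0.8, -1.1))
```

The first two values should be zero up to rounding, while the third is close to 4, so $x^2 + y^2$ cannot serve as the real or imaginary part of an analytic function.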

To summarize, then, for the real and imaginary parts of an analytic function we cannot take just any functions $u(x, y)$ and $v(x, y)$ but only harmonic functions. Also, these functions must be connected by the relations (17); such functions are called conjugate harmonic functions. It can be verified that if one of two conjugate functions is chosen, then by virtue of these relations the other one is completely defined up to an arbitrary additive constant. Let the function $v$ be chosen and let the relations (17) be satisfied by two functions: $u = u_1$ and $u = u_2$. Then $\dfrac{\partial u_1}{\partial x} = \dfrac{\partial u_2}{\partial x}$, since both the derivatives are equal to $\dfrac{\partial v}{\partial y}$, and so $\dfrac{\partial (u_1 - u_2)}{\partial x} = 0$, which is to say that $u_1 - u_2$ does not depend on $x$. Similarly we obtain $\dfrac{\partial (u_1 - u_2)}{\partial y} = 0$, that is, the difference $u_1 - u_2$ does not depend on $y$ either, and so it is a constant.

The Laplace equation is of great importance in mathematical physics. For example, the electrostatic potential in the space between long conductors stretching perpendicular to the xy-plane is a function of $x$, $y$ alone and satisfies equation (18). At points of the xy-plane where it is punctured by the conductors, the Laplace equation for the potential fails. In particular, the potential has a singularity (becomes infinite) where the plane is intersected by a conductor whose cross-section is of an infinitely small diameter. Thus, to put it more precisely, we must say that the potential is a harmonic function in that portion of the plane where there are no charges. This will be discussed in more detail in Sec. 10.5.

The relationship between $u$ and $v$ has a simple geometric meaning. Construct the line $u(x, y) = \text{constant}$. Since $\dfrac{\partial u}{\partial x}dx + \dfrac{\partial u}{\partial y}dy = 0$, it follows that the slope of this line to the x-axis is

$$ \tan\alpha_1 = \left.\frac{dy}{dx}\right|_{u=\text{const}} = -\frac{\partial u/\partial x}{\partial u/\partial y} $$

Similarly, the slope of the line $v(x, y) = \text{constant}$ to the x-axis is

$$ \tan\alpha_2 = \left.\frac{dy}{dx}\right|_{v=\text{const}} = -\frac{\partial v/\partial x}{\partial v/\partial y} $$

Using the Cauchy-Riemann conditions, we get

$$ \tan\alpha_2 = \frac{\partial u/\partial y}{\partial u/\partial x} = -\frac{1}{\tan\alpha_1} \tag{19} $$

What we have is the condition of perpendicularity of tangent lines to the lines $u(x, y) = \text{constant}$ and $v(x, y) = \text{constant}$.*

* For perpendicularity, it must be true that $\alpha_2 = \alpha_1 \pm 90°$, i.e. $\tan\alpha_2 = \tan(\alpha_1 \pm 90°) = -\cot\alpha_1 = -\dfrac{1}{\tan\alpha_1}$, which is the condition (19).

Fig. 58


Thus, taking any analytic function $f(z) = u(x, y) + iv(x, y)$, we get two families of curves intersecting at right angles at every point. These curves are shown in Fig. 58 for the case $f(z) = z^2$ given above.

If $u$ is the electrostatic potential, then $u(x, y) = \text{constant}$ represents the lines of constant potential and $v(x, y) = \text{constant}$, the lines of field intensity (force lines); at every point, the line of field intensity is along the normal to the line $u(x, y) = \text{constant}$ passing through this point.
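
The right-angle property can be illustrated for $f(z) = z^2$ by comparing gradients: each level line is perpendicular to the gradient of its own function, so the lines $u = \text{constant}$ and $v = \text{constant}$ cross at right angles exactly when the two gradients are perpendicular. The sketch below checks that their scalar product vanishes; the helper names and sample points are assumptions.

```python
def grad(g, x, y, h=1e-6):
    """Central-difference gradient (g_x, g_y) at the point (x, y)."""
    return ((g(x + h, y) - g(x - h, y)) / (2 * h),
            (g(x, y + h) - g(x, y - h)) / (2 * h))

u = lambda x, y: x * x - y * y   # real part of z^2
v = lambda x, y: 2 * x * y       # imaginary part of z^2

def gradient_dot(x, y):
    """Scalar product of grad u and grad v; zero means perpendicular level lines."""
    ux, uy = grad(u, x, y)
    vx, vy = grad(v, x, y)
    return ux * vx + uy * vy

for point in [(1.0, 0.5), (-0.3, 2.0), (0.7, -1.2)]:
    print(point, gradient_dot(*point))
```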

Exercises 

1. Let $f(z) = u(x, y) + iv(x, y)$. Knowing that $u(x, y) = x + x^2 - y^2$, determine $v(x, y)$ if $f(0) = 0$.

2. Knowing that $v(x, y) = -2xy$, $f(0) = 1$, determine $u(x, y)$.

5.8  The  integral  of  a function  of  a complex  variable 

We now define the integral of the complex function $f(z)$. Take two points in the plane, the initial point $z_{\text{init}}$ and the terminal point $z_{\text{ter}}$, and join them with some curve $(l)$ (Fig. 59). Partition this curve into $p$ subsections and number the subdivision points thus:

$$ z_{\text{init}} = z_0,\ z_1,\ \ldots,\ z_{\text{ter}} = z_p $$

Fig. 59

Form the sum

$$ S = \sum_{j=1}^{p} f(\zeta_j)\,(z_j - z_{j-1}) \tag{20} $$

where the point $\zeta_j$ is chosen arbitrarily on a subsection of the line $(l)$ between $z_{j-1}$ and $z_j$. The reader should bear in mind that the values of the function $f(\zeta_j)$ and the quantities $z_j - z_{j-1}$ are complex numbers, and so when forming the sum (20) we perform operations with complex numbers, and the result, that is $S$, is also a complex number.

We use the term integral for the sum $S$, provided that the line $(l)$ is subdivided into such small subsections that any further refinement in the partition does not alter the sum $S$ (to put this more precisely, the limit of the sum for an infinite refinement of the partition of the curve). We denote the integral thus:

$$ I = \int_{z_{\text{init}}}^{z_{\text{ter}}} f(z)\,dz $$

From the definition it follows that the integral is multiplied by $-1$ when the direction of integration is reversed, for then all differences $z_j - z_{j-1}$, and therefore the sum (20) as well, are multiplied by $-1$.

The following question arises. It is clear that the points $z_{\text{init}}$ and $z_{\text{ter}}$ may be joined in a variety of ways (Fig. 60). In forming the sum (20) for different curves joining the points $z_{\text{init}}$ and $z_{\text{ter}}$ we will have to do with distinct values of $f(\zeta_j)$ and $z_j - z_{j-1}$. Does the integral $I$ depend on the choice of path or does it depend only on the initial ($z_{\text{init}}$) and terminal ($z_{\text{ter}}$) points?

Fig. 60


It  turns  out  that  if  f(z)  is  an  analytic  function  and  if,  in  a region 
bounded  by  different  paths,  f(z)  never  becomes  infinite,  then  the 
integral  does  not  depend  on  the  choice  of  path.  * 

* Besides the condition that $f(z)$ not go to infinity, it is also necessary that $f(z)$ be a single-valued function or, at least, that when moving from one path to another we always make use of one branch of the function $f(z)$ (see the end of this section).

To prove this, denote the closed contour $z_{\text{init}}Az_{\text{ter}}Bz_{\text{init}}$ by the letter $(L)$ and the integral $\oint_{(L)} f(z)\,dz$ by the letter $I$ (the symbol $\oint$ denotes an integral over a closed contour). Since

$$ I = \int_{z_{\text{init}}Az_{\text{ter}}} f(z)\,dz + \int_{z_{\text{ter}}Bz_{\text{init}}} f(z)\,dz = \int_{z_{\text{init}}Az_{\text{ter}}} f(z)\,dz - \int_{z_{\text{init}}Bz_{\text{ter}}} f(z)\,dz $$

it suffices to verify that $I = 0$. Conversely, if integrals of $f(z)$ over the contours $z_{\text{init}}Az_{\text{ter}}$ and $z_{\text{init}}Bz_{\text{ter}}$ are equal, then $I = 0$. This means that the fact that the integral of an analytic function is independent of the path of integration is equivalent to the following assertion, which is known as Cauchy's integral theorem: The integral, extended over a closed curve, of a function that is analytic throughout the interior of the curve (contour) and on it is equal to zero.

Fig. 61

To prove Cauchy's integral theorem, we partition the portion of the plane bounded by the contour $(L)$ into small squares with contours $(L_k)$ and traverse each contour counterclockwise, which is the sense of traversal of $(L)$. (Fig. 61 depicts one of the contours.)
Then

$$ \oint_{(L)} f(z)\,dz = \sum_k \oint_{(L_k)} f(z)\,dz \tag{21} $$

since in the right member all the integrals around the inner sides of the squares cancel out. If within $(L_k)$ we take a point $z_k$, then on $(L_k)$ we have

$$ \frac{f(z) - f(z_k)}{z - z_k} \approx f'(z_k), \quad \text{or} \quad \frac{f(z) - f(z_k)}{z - z_k} = f'(z_k) + \alpha $$

where $\alpha$ is an infinitesimal of the order of the length $h$ of a side of a small square. From this we have

$$ \oint_{(L_k)} f(z)\,dz = \oint_{(L_k)} \left[f(z_k) + f'(z_k)(z - z_k) + \alpha(z - z_k)\right] dz $$

The integral of the first two terms is easily taken; and since the integration is performed over a closed curve, it is equal to zero. This leaves only the integral of the third term, which has the order $h^3$, since the length of the contour of integration is of the order of $h$ and the factor $z - z_k$ has the same order. But in the right-hand member of (21) the number of terms in the sum has the order $\dfrac{1}{h^2}$ and thus the entire sum has the order $h$. This means that the right side tends to zero, as $h \to 0$, when the squares become infinitely small. But it must be equal to the constant left member, that is, this constant is equal to zero, which is what we set out to prove.
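
The theorem can be illustrated numerically by forming the sum (20) around a closed square. In the sketch below the function `contour_integral`, the square and the two test functions are assumptions chosen for illustration; the analytic function $z^2$ should give a result close to zero, whereas the non-analytic $z^*$ should not.

```python
def contour_integral(f, vertices, steps=2000):
    """Sum (20) along the closed polygon through `vertices` (midpoint rule)."""
    total = 0j
    n = len(vertices)
    for k in range(n):
        a, b = vertices[k], vertices[(k + 1) % n]
        dz = (b - a) / steps
        for j in range(steps):
            zeta = a + (j + 0.5) * dz   # point chosen inside the j-th subsection
            total += f(zeta) * dz
    return total

# Square with corners at +-1 +- i, traversed counterclockwise
square = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]

print(abs(contour_integral(lambda z: z * z, square)))     # analytic: close to 0
print(contour_integral(lambda z: z.conjugate(), square))  # not analytic: nonzero
```

For $z^*$ the value is $2i$ times the enclosed area, here $8i$, which shows that the vanishing of the closed-contour integral really depends on analyticity.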


Later on we will give a different proof of this important theorem via the methods of vector analysis (see Exercise 2 of Sec. 11.7).

Thus, for $z_{\text{init}}$ fixed, $\int_{z_{\text{init}}}^{z} f(z)\,dz$ depends only on the terminal point $z$ of the path, which is to say, it is a function of $z$. Let us denote this function by $\Phi(z)$; then $\int_{z_{\text{init}}}^{z} f(z)\,dz = \Phi(z)$.

Let us find the derivative of this function:

$$ \frac{d\Phi(z)}{dz} = \frac{\Phi(z + dz) - \Phi(z)}{dz} = \frac{\int_{z_{\text{init}}}^{z+dz} f(z)\,dz - \int_{z_{\text{init}}}^{z} f(z)\,dz}{dz} = \frac{1}{dz}\int_{z}^{z+dz} f(z)\,dz $$

Consider $\int_{z}^{z+dz} f(z)\,dz$. Since the numbers $z$ and $z + dz$ are almost the same and $f(z)$ is a continuous function, $f(z)$ hardly changes at all as $z$ varies within these limits. Therefore, $\int_{z}^{z+dz} f(z)\,dz \approx f(z)(z + dz - z) = f(z)\,dz$ and this equation can be made as exact as desired by decreasing $dz$. Hence

$$ \frac{d\Phi(z)}{dz} = \frac{f(z)\,dz}{dz} = f(z) \tag{22} $$

Formula (22) shows that the relationship between the integrand function $f(z)$ and the integral $\Phi(z)$ remains the same as in the case of a function of a real variable.

Also, we will show that the ordinary formula for computing an integral is retained:

$$ \int_{z_{\text{init}}}^{z_{\text{ter}}} f(z)\,dz = \Phi(z_{\text{ter}}) - \Phi(z_{\text{init}}) \tag{23} $$

where $\Phi(z)$ is any function satisfying relation (22).

Indeed, now $\int_{z_{\text{init}}}^{z} f(z)\,dz = \Phi(z) + C$, where $C$ is a constant. Here, setting $z = z_{\text{init}}$, we get $0 = \Phi(z_{\text{init}}) + C$ whence $C = -\Phi(z_{\text{init}})$. And so $\int_{z_{\text{init}}}^{z} f(z)\,dz = \Phi(z) - \Phi(z_{\text{init}})$. Putting $z = z_{\text{ter}}$, we get (23).
The formulas (22) and (23) show that all the rules for finding integrals that hold for ordinary real integrals (see, for example, HM, Ch. 3) are also applicable to integrals of complex functions.
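
Formula (23) can be tried out numerically. The sketch below takes $f(z) = e^z$, for which $\Phi(z) = e^z$ satisfies (22), and compares $\Phi(z_{\text{ter}}) - \Phi(z_{\text{init}})$ with the sum (20) along two different broken-line paths. The function `path_integral`, the end points and the detour are assumptions for illustration.

```python
import cmath

def path_integral(f, points, steps=4000):
    """Approximate the sum (20) along the broken line through `points`."""
    total = 0j
    for a, b in zip(points, points[1:]):
        dz = (b - a) / steps
        for j in range(steps):
            total += f(a + (j + 0.5) * dz) * dz   # midpoint of each subsection
    return total

z_init, z_ter = 0j, 1 + 1j
exact = cmath.exp(z_ter) - cmath.exp(z_init)   # Phi(z_ter) - Phi(z_init)
direct = path_integral(cmath.exp, [z_init, z_ter])
detour = path_integral(cmath.exp, [z_init, 2 + 0j, 2 + 3j, z_ter])
print(exact, direct, detour)
```

All three numbers should agree to within the discretization error, whichever path is used.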

In the theory of analytic functions we often encounter multiple-valued functions. Consider for instance the function $w = \sqrt{z}$. In Sec. 5.4 we showed that this function has two values: if $z = re^{i\varphi}$, then

$$ w_1 = \sqrt{r}\,e^{i\varphi/2}, \quad w_2 = \sqrt{r}\,e^{i(\varphi/2 + \pi)} $$

If we choose one of these values (or, as we say, one branch of the function), we have the following interesting property. Let $z$ go round the point $z = 0$ in a positive sense* and return to the original position. Then to $\varphi$ is added $2\pi$ and the value $w_1$ becomes

$$ \sqrt{r}\,e^{i(\varphi + 2\pi)/2} = \sqrt{r}\,e^{i(\varphi/2 + \pi)} = w_2 $$

In the same way, $w_2$ will then go to

$$ \sqrt{r}\,e^{i\left(\frac{\varphi + 2\pi}{2} + \pi\right)} = \sqrt{r}\,e^{i\varphi/2} \cdot e^{2\pi i} = \sqrt{r}\,e^{i\varphi/2} = w_1 $$

* We assume the direction of traversal to be positive if the point $z = 0$ always remains on the left during the traversal.

Thus, the two branches of the function $w = \sqrt{z}$ continuously pass one into the other in circling the point $z = 0$, and if the point $z = 0$ is encircled twice, we return to the original branch.

The point $z = z_0$, which when encircled has one branch of a multiple-valued function replacing another, is called a branch point. Thus, for a function $w = \sqrt{z}$ the point $z = 0$ is a branch point of the second order (since there are two branches). The “infinite point” $z = \infty$ is generally said to be another branch point of this function. More than two branches can alternate at a branch point; for example, the function $w = \sqrt[n]{z}$ has $n$ branches that continuously replace one another in circular order when encircling the branch point $z = 0$.

Another important example of a multivalued function is the function $w = \ln z$. In Sec. 5.4 we saw that this function has an infinite number of values: $w_k = \ln r + i(\varphi + 2k\pi)$ $(k = 0, \pm 1, \pm 2, \ldots)$. If the point $z$ encircles the origin and $2\pi$ is added to $\varphi$, then the value $w_0$ goes into $w_1$, the value $w_1$ goes into $w_2$, and so on. If we again encircle the origin, we will pass to ever new branches and will never return to the original branch. Such a branch point is termed a branch point of infinite order.

In order to be able to consider a single branch independently of any other one, it is necessary in some way to prohibit the point $z$ from making circuits about the branch points of the function at hand. Ordinarily, to do this, one or several lines, called branch cuts, are drawn in the plane joining the branch points; these cuts cannot be crossed. For example, in considering the function $w = \sqrt{z}$ we can draw a branch cut along the positive real axis from the point $z = 0$ to infinity. If the point $z$ varies in arbitrary fashion in the plane outside this cut, it cannot make a circuit about the branch point $z = 0$ and therefore one branch cannot give way to another. After such a cut has been made, each branch may be regarded as a single-valued analytic function (although the branch has a discontinuity along the cut: it takes on different values on different sides of the cut). In particular, we can apply the Cauchy integral theorem to such a branch.
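
The passage of one branch of $\sqrt{z}$ into the other can be followed numerically. In the sketch below the point $z$ moves in small steps around the unit circle, and at each step we keep whichever of the two square roots lies closer to the previous value, which imitates continuous variation of the chosen branch. The helper names, the step count and the starting value are assumptions for illustration.

```python
import cmath
import math

def continue_sqrt(w_start, z_path):
    """Follow one branch of sqrt(z) continuously along z_path, starting from w_start."""
    w = w_start
    for z in z_path:
        root = cmath.sqrt(z)
        # Of the two values +root and -root, keep the one closest to the previous value
        w = root if abs(root - w) <= abs(-root - w) else -root
    return w

def circle(turns, steps_per_turn=360):
    """Points on |z| = 1 for the given number of positive circuits, starting near z = 1."""
    n = turns * steps_per_turn
    return [cmath.exp(2j * math.pi * k / steps_per_turn) for k in range(1, n + 1)]

w1 = 1 + 0j                           # branch with sqrt(1) = 1
print(continue_sqrt(w1, circle(1)))   # after one circuit about z = 0
print(continue_sqrt(w1, circle(2)))   # after two circuits
```

One circuit should carry the value 1 into approximately $-1$ (the other branch), and a second circuit should bring it back to approximately 1.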

From now on, for integrands we will consider only single-valued analytic functions. Incidentally, in Sec. 5.9 we will see that this does not save us from the necessity of considering multivalued functions that result from integration.

Exercise 

Find the integrals of the following functions over the upper and lower semicircles going from $z = -1$ to $z = 1$ with centre at the point $z = 0$: (a) $z^2$, (b) $\dfrac{1}{z}$, (c) $\sqrt{z}$ (for the branch equal to $i$ when $z = -1$). Explain the coincidence of results in (a) and the noncoincidence in (b) and (c).

5.9  Residues 

To summarize: if an integral is independent of a path, then an integral over a closed circuit is equal to zero. Above we noted that an integral is not dependent on the path if the integrand does not become infinite. Let us consider an example in which the integrand becomes infinite.

Let $I = \oint \dfrac{dz}{z}$. Here, $f(z) = \dfrac{1}{z}$ becomes infinite when $z = 0$. We evaluate the integral over the closed path that makes a circuit of the point $z = 0$ in the positive direction (counterclockwise), say, around a circle $(C)$ of radius $r$ with centre at the coordinate origin (Fig. 62). On this circle, $z = re^{i\varphi}$, where $r$ is the radius of the circle and the variable $\varphi$ ranges from 0 to $2\pi$. Then $dz = re^{i\varphi}\,i\,d\varphi$ and

$$ \oint_{(C)} \frac{dz}{z} = \int_0^{2\pi} \frac{re^{i\varphi}\,i\,d\varphi}{re^{i\varphi}} = 2\pi i $$

The integral round the closed circuit turned out not equal to zero.

We know that $\int \dfrac{dz}{z} = \ln z$. The fact that the integral $\oint \dfrac{dz}{z}$ round the closed circuit is not equal to zero is in remarkable agreement with the multivalued nature of the function $\ln z$. Consider, for example, $I = \int_1^2 \dfrac{dz}{z}$. Going from $z = 1$ to $z = 2$ via the shortest route (Fig. 63a), we find $I = \ln 2 - \ln 1 = \ln 2 \approx 0.69$. If we take a longer route: first one circuit about the origin and then to the desired point (Fig. 63b), we get $I_1 = 2\pi i + 0.69$. If we make $n$ circuits about the origin, then we have $I_n = 2\pi i \cdot n + 0.69$.

Fig. 63

In Sec. 5.4 we found out that, true enough, the quantities $2\pi i \cdot n + 0.69$ for all integral $n$ serve as the logarithms of the number 2, since $e^{2\pi i \cdot n} = 1$. Thus, the multivalued nature of the logarithm is the result of a choice of distinct routes with different numbers of circuits about the point $z = 0$, where $1/z$ becomes infinite.
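
The values $I_n = 2\pi i \cdot n + \ln 2$ can be reproduced by forming the sum (20) along a route that first makes $n$ positive circuits of the unit circle and then runs along the real axis from 1 to 2. The function name `integral_dz_over_z` and the step count are assumptions made for this sketch.

```python
import cmath
import math

def integral_dz_over_z(n_circuits, steps=5000):
    """Integrate dz/z from z = 1 to z = 2 after n positive circuits of |z| = 1."""
    total = 0j
    dphi = 2 * math.pi / steps
    # n circuits of the unit circle, z = exp(i*phi)
    for k in range(steps * n_circuits):
        z_mid = cmath.exp(1j * (k + 0.5) * dphi)
        dz = cmath.exp(1j * (k + 1) * dphi) - cmath.exp(1j * k * dphi)
        total += dz / z_mid
    # straight segment of the real axis from 1 to 2
    dx = 1.0 / steps
    for k in range(steps):
        total += dx / (1 + (k + 0.5) * dx)
    return total

for n in range(3):
    print(n, integral_dz_over_z(n), math.log(2) + 2j * math.pi * n)
```

Each printed pair should agree to several decimal places: the real part stays at $\ln 2 \approx 0.69$, and every additional circuit adds $2\pi i$.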
The value of the integral depends on how many times and in which direction we made the circuit of the origin, but does not depend on the path traversed in any circuit. We now prove the last assertion. Let us try to find $I_A = \oint \dfrac{dz}{z}$ around some path ABA (Fig. 64a). Consider the integral $I_0 = \oint \dfrac{dz}{z}$ around the path shown in Fig. 64b. This path consists of ABA, two close-lying straight lines AC and CA and a circle of radius OC centred at the origin. $I_0 = 0$, since this is an integral around a closed circuit inside which $1/z$ does not become infinite anywhere.

The integral $I_0$ consists of $I_A$, two integrals $\int_C^A \dfrac{dz}{z}$ and $\int_A^C \dfrac{dz}{z}$ that mutually cancel out, and the integral around the circle of radius OC. Since the integration around the circle is in a direction opposite to that in which the angles are reckoned, the appropriate integral is equal to $-2\pi i$. Therefore $I_0 = 0 = I_A - 2\pi i$ or $I_A = 2\pi i$, which means that $I_A$ coincides with the value of the integral around a circle of arbitrary radius.

By similar means it is possible to reduce integrals over curves that make calculation awkward to integrals around small circles about points that make the integrand go to infinity. Here, one should not think that the integral must definitely be different from zero. For example, in the integral

$$ \oint \frac{dz}{z^m} \quad (m = 2, 3, 4, \ldots) \tag{24} $$

the integrand has a singularity (it becomes infinite) at $z = 0$. However, this integral is equal to zero around any closed circuit whether it encloses the point or not (but does not pass through the point!). Indeed, in the given example, the indefinite integral is equal to $\dfrac{z^{-m+1}}{-m+1} + C$, which means it is a single-valued function; now the increment of a single-valued function around a closed circuit is equal to zero (why?).

For any $m$, the integral (24) around the circle $|z| = r$ can be evaluated in the following manner. Set $z = re^{i\varphi}$; then a few simple manipulations bring the integral to the form

$$ i\,r^{1-m}\int_0^{2\pi} e^{i(1-m)\varphi}\,d\varphi $$

A straightforward evaluation shows that it is equal to zero for any integral $m \neq 1$. We excluded the case of a nonintegral $m$, because then the integrand is a multivalued function.
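
The statement about the integral (24) is easy to check numerically with the substitution $z = re^{i\varphi}$, $dz = iz\,d\varphi$. The function `circle_integral`, the radius and the step count below are assumptions for illustration.

```python
import cmath
import math

def circle_integral(f, r=1.0, steps=4000):
    """Integrate f around |z| = r counterclockwise using z = r*exp(i*phi)."""
    total = 0j
    dphi = 2 * math.pi / steps
    for k in range(steps):
        z = r * cmath.exp(1j * (k + 0.5) * dphi)
        total += f(z) * 1j * z * dphi     # dz = i*z*dphi
    return total

for m in range(1, 5):
    print(m, circle_integral(lambda z, m=m: z ** (-m), r=0.5))
```

Only $m = 1$ should give a nonzero result, namely $2\pi i$; for $m = 2, 3, 4$ the values should vanish up to rounding.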

Let us consider another example. Suppose we have to evaluate the integral

$$ \oint \frac{\cos z}{z^3}\,dz \tag{25} $$

around a closed circuit about the origin $z = 0$ in the positive sense; in this example, the origin is a singular point for the integrand because this function becomes infinite at the origin.

Recalling the expansion of the function $\cos z$ about the point $z = 0$ in a Taylor power series,

$$ \cos z = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \frac{z^6}{6!} + \ldots $$

we can write

$$ \frac{\cos z}{z^3} = \frac{1}{z^3} - \frac{1}{2!\,z} + \frac{z}{4!} - \frac{z^3}{6!} + \ldots \tag{26} $$

In this example, the integrand tends to infinity at the rate of $\dfrac{1}{|z|^3}$ as $z \to 0$. Such a singular point is termed a pole of the third order.

To evaluate the integral (25), carry out a term-by-term integration of the series (26). The indefinite integral of any term, except the second, yields a single-valued function (a power with integral exponent) and so the corresponding integral around a closed circuit is equal to zero. (For one thing, the integral of the first, principal, term of the expansion (26) is equal to zero.) By virtue of the foregoing, the integral of the second term is

$$ \oint \left(-\frac{1}{2!\,z}\right) dz = -\frac{1}{2}\oint \frac{dz}{z} = -\frac{1}{2} \cdot 2\pi i = -\pi i $$

Hence in this example the whole integral (25) is equal to $-\pi i$.
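
The value $-\pi i$ can be confirmed by direct numerical integration around the unit circle. The function `circle_integral`, with its assumed parameters `center`, `r` and `steps`, is an illustrative helper and not part of the text.

```python
import cmath
import math

def circle_integral(f, center=0j, r=1.0, steps=4000):
    """Integrate f counterclockwise around the circle |z - center| = r."""
    total = 0j
    dphi = 2 * math.pi / steps
    for k in range(steps):
        w = r * cmath.exp(1j * (k + 0.5) * dphi)   # w = z - center
        total += f(center + w) * 1j * w * dphi     # dz = i*w*dphi
    return total

value = circle_integral(lambda z: cmath.cos(z) / z**3)
print(value, -1j * math.pi)
```

Although the integrand grows like $1/|z|^3$ near the origin, only the term $-1/(2z)$ of the expansion (26) contributes, and the computed value should agree with $-\pi i$.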


Now let us consider a pole of general form. If a (single-valued) function $f(z)$ has a pole of order $n$ at some point $z = a$, then, about this point, it can be expanded in what is known as a Laurent series:

$$ f(z) = c_{-n}(z - a)^{-n} + c_{-n+1}(z - a)^{-n+1} + \ldots + c_{-1}(z - a)^{-1} + c_0 + c_1(z - a) + c_2(z - a)^2 + \ldots $$

$$ = \frac{c_{-n}}{(z - a)^n} + \frac{c_{-n+1}}{(z - a)^{n-1}} + \ldots + \frac{c_{-1}}{z - a} + c_0 + c_1(z - a) + c_2(z - a)^2 + \ldots \tag{27} $$

in terms of positive and negative integer powers of $z - a$, beginning with the $(-n)$th power. Suppose it is required to evaluate the integral

$$ \oint f(z)\,dz \tag{28} $$

around a contour enclosing the point $z = a$ in the positive direction and not containing within it any other singular points except this one. As we have mentioned, it is possible to go from the given integral to an integral around a small circle centred at $a$, and near this point we can take advantage of the expansion (27). As in the preceding example, the integrals of all terms become zero after integration around the closed contour; that is, with the exception of $\oint \dfrac{c_{-1}}{z - a}\,dz = 2\pi i\,c_{-1}$. That is the value of the whole integral (28). The coefficient of the $(-1)$th power of $z - a$ in the Laurent expansion has a special name: the residue of the function $f(z)$ at the point $a$. Thus, the integral (28) is

$$ 2\pi i\,\operatorname{Res}_{z=a} f(z) \tag{29} $$

Now let it be required to evaluate an integral of type (28) around a certain contour $(L)$ (Fig. 65), where the integrand $f(z)$ is single-valued and analytic everywhere on the contour $(L)$ and within it, with the exception of a certain number of singular points. (In Fig. 65 we have three such points: $a_1$, $a_2$, and $a_3$.) We draw auxiliary lines (shown dashed in Fig. 65) so that the region bounded by $(L)$ is divided into parts, in each of which there is one singular point. Denote the contours of these parts that are traversed in the positive direction by $(L_1)$, $(L_2)$, and $(L_3)$. It is then easy to verify that

$$ \oint_{(L)} f(z)\,dz = \oint_{(L_1)} f(z)\,dz + \oint_{(L_2)} f(z)\,dz + \oint_{(L_3)} f(z)\,dz \tag{30} $$

because in the right-hand member the integrals taken along the auxiliary lines cancel out. Each of the contours $(L_1)$, $(L_2)$, $(L_3)$ contains within it only one singular point and so each of the integrals on the right of (30) is evaluated by formula (29), and we get

$$ \oint_{(L)} f(z)\,dz = 2\pi i\,\operatorname{Res}_{z=a_1} f(z) + 2\pi i\,\operatorname{Res}_{z=a_2} f(z) + 2\pi i\,\operatorname{Res}_{z=a_3} f(z) = 2\pi i\left[\operatorname{Res}_{z=a_1} f(z) + \operatorname{Res}_{z=a_2} f(z) + \operatorname{Res}_{z=a_3} f(z)\right] \tag{31} $$

To summarize: the integral (28) is equal to the product of $2\pi i$ by the sum of the residues of the integrand at all singular points located inside the contour of integration.

We now show how to compute a residue for the important case of a pole of the first order. A first-order pole ordinarily results if the integrand $f(z)$ is a ratio of two finite functions, $f(z) = g(z)/h(z)$, and at some point $z = a$ the numerator is different from zero and the denominator has a zero of the first order, that is, the expansion of the denominator in powers of $z - a$ begins with a first-degree term. Writing down the Taylor-series expansion of the numerator and denominator about $z = a$, we get

$$f(z) = \frac{g(a) + g'(a)(z-a) + \frac{g''(a)}{2}(z-a)^2 + \dots}{h'(a)(z-a) + \frac{h''(a)}{2}(z-a)^2 + \dots}$$

Near the point $z = a$ we can replace the right member by $\frac{g(a)}{h'(a)(z-a)}$. For this reason, the residue, that is, the coefficient of $(z-a)^{-1}$, is here equal to

$$\operatorname{Res}_{z=a} f(z) = \frac{g(a)}{h'(a)} \qquad (32)$$

Let us consider an example. Suppose it is required to compute the integral

$$I = \oint_{(L)} \frac{z+1}{(z-1)\sin z}\,dz$$

where $(L)$ is a circle of radius 2 with centre at the coordinate origin (Fig. 66). In this case the indefinite integral cannot be expressed in terms of elementary functions, yet we will easily find the integral around the closed curve. Note that the integrand has singularities where its denominator vanishes, i.e. for $z = 1$ and $z = k\pi$ ($k$ any integer). Of these points shown in Fig. 66, only two are inside $(L)$: $z = 0$ and $z = 1$. Therefore, by (31),

$$I = 2\pi i\left[\operatorname{Res}_{z=0} f(z) + \operatorname{Res}_{z=1} f(z)\right] \qquad (33)$$

Since at each of these points the denominator has a first-order zero and the numerator is nonzero, we have two poles of the first order and the residues at those points can be computed from formula (32). In the given example

$$g(z) = z + 1, \quad h(z) = (z-1)\sin z, \quad h'(z) = \sin z + (z-1)\cos z$$

whence

$$\operatorname{Res}_{z=0} f(z) = \frac{g(0)}{h'(0)} = \frac{1}{-1} = -1, \qquad \operatorname{Res}_{z=1} f(z) = \frac{g(1)}{h'(1)} = \frac{2}{\sin 1}$$

Substituting into (33), we get

$$I = 2\pi i\left(-1 + \frac{2}{\sin 1}\right) = 8.65i$$
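The value $8.65i$ can be checked numerically. The following is a minimal sketch that is not part of the original text: it approximates the contour integral by sampling the circle $|z| = 2$ at equally spaced points and compares the result with $2\pi i(-1 + 2/\sin 1)$ obtained from formulas (31) and (32). The helper name `contour_integral` and the number of sample points are illustrative assumptions.

```python
import cmath
import math

def contour_integral(f, centre, radius, n=400):
    """Approximate the integral of f around a circle, traversed counterclockwise.

    Uses z = centre + radius*e^{i*theta} and a uniform sum over theta.
    """
    total = 0j
    dtheta = 2 * math.pi / n
    for k in range(n):
        e = cmath.exp(1j * k * dtheta)
        z = centre + radius * e
        dz = 1j * radius * e * dtheta  # dz = i*radius*e^{i*theta} d(theta)
        total += f(z) * dz
    return total

def integrand(z):
    # The integrand of the worked example: (z + 1) / ((z - 1) sin z)
    return (z + 1) / ((z - 1) * cmath.sin(z))

numeric = contour_integral(integrand, 0, 2)
by_residues = 2j * math.pi * (-1 + 2 / math.sin(1))
print(numeric, by_residues)
```

The two printed numbers should agree closely, because a uniform sum over a closed circle converges very quickly for an integrand that is analytic on the contour.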

Residue theory is also used to compute certain real-valued integrals with the aid of an artificial reduction of them to complex-valued ones. Consider, for example, the integral

$$I_1 = \int_{-\infty}^{\infty} \frac{\cos \omega x}{1 + x^2}\,dx \quad (\omega > 0) \qquad (34)$$

Here again, the indefinite integral cannot be expressed in terms of elementary functions so that computing $I_1$ by any standard method is out of the question. To evaluate $I_1$, first of all note that

$$\int_{-\infty}^{\infty} \frac{\sin \omega x}{1 + x^2}\,dx = 0 \qquad (35)$$

as the integral of any odd integrand within limits that are symmetric about zero. From (34) and (35) it follows that

$$I_1 = \int_{-\infty}^{\infty} \frac{\cos \omega x + i \sin \omega x}{1 + x^2}\,dx = \int_{-\infty}^{\infty} \frac{e^{i\omega x}}{1 + x^2}\,dx \qquad (36)$$

Now consider the auxiliary integral

$$I_R = \oint_{(L_R)} \frac{e^{i\omega z}}{1 + z^2}\,dz \qquad (37)$$

where the contour $(L_R)$ consists of the semicircle and its diameter shown in Fig. 67. Since the integrand has poles of the first order at the points $z = \pm i$, of which there is only one, $z = i$, inside $(L_R)$, from formulas (31) and (32) we get

$$I_R = 2\pi i \operatorname{Res}_{z=i} \frac{e^{i\omega z}}{1 + z^2} = 2\pi i\,\frac{e^{-\omega}}{2i} = \pi e^{-\omega}$$

On the other hand, the integral (37) may be represented as a sum of the integral along the segment of the real axis equal to

$$\int_{-R}^{R} \frac{e^{i\omega x}}{1 + x^2}\,dx \qquad (38)$$

and the integral around the semicircle, which we denote by $(L'_R)$. Let us estimate this latter integral. From the definition of an integral it is easy to derive the following estimate:

$$\left|\int_{(L)} f(z)\,dz\right| \le \max_{(L)} |f(z)| \cdot \text{length}\,(L)$$

Therefore

$$\left|\int_{(L'_R)} \frac{e^{i\omega z}}{1 + z^2}\,dz\right| \le \max_{(L'_R)} \left|\frac{e^{i\omega z}}{1 + z^2}\right| \cdot \pi R \qquad (39)$$


Representing $z = x + iy$ and noting that $y \ge 0$ on $(L'_R)$, we find, on $(L'_R)$, $|e^{i\omega z}| = |e^{i\omega(x + iy)}| = |e^{i\omega x}|\,|e^{-\omega y}| = e^{-\omega y} \le 1$. On the other hand, on $(L'_R)$ we have

$$|1 + z^2| = \left|z^2\left(1 + \frac{1}{z^2}\right)\right| \approx R^2$$

(for large $R = |z|$). Hence the right member and, with it, the left member of (39) too tend to zero as $\frac{\pi R}{R^2} = \frac{\pi}{R}$ as $R \to \infty$.

Summarizing, we see that $I_R = \pi e^{-\omega}$ may be represented as the sum of the integral (38) (which in the limit tends to (36), that is, to $I_1$ as $R \to \infty$) and the integral around $(L'_R)$ that tends to zero. Passing to the limit as $R \to \infty$, we get $I_1 = \pi e^{-\omega}$ or

$$\int_{-\infty}^{\infty} \frac{\cos \omega x}{1 + x^2}\,dx = \pi e^{-\omega} \qquad (40)$$


This  last  formula  is  not  obvious  in  the  least  and  to  obtain  it 
without  invoking  complex  numbers  is  extremely  difficult  and  requires 
consummate  skill.  With  the  aid  of  the  properties  of  integrals  of 
complex  functions  it  is  possible  to  find  many  integrals  of  that  kind 
of  real-valued  functions  via  standard  procedures  that  only  require 
care  and  accuracy. 
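Formula (40) can also be checked by direct numerical integration. The sketch below is an illustration and not part of the original text: it truncates the infinite interval to $[-500, 500]$ (an assumed cutoff, which leaves a tail of order $1/(\omega X^2)$) and applies the composite Simpson rule. The function names and parameter values are hypothetical.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def cosine_integral(omega, cutoff=500.0, n=200_000):
    """Approximate the integral (34), truncated to [-cutoff, cutoff].

    The integrand is even, so the integral over [0, cutoff] is doubled.
    """
    return 2 * simpson(lambda x: math.cos(omega * x) / (1 + x * x), 0.0, cutoff, n)

for omega in (0.5, 1.0, 2.0):
    # Second column: numerical value; third column: pi * exp(-omega) from (40)
    print(omega, cosine_integral(omega), math.pi * math.exp(-omega))
```

The agreement is limited by the truncation of the interval, not by the Simpson rule, so a larger cutoff is the natural way to tighten it.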

In practice, the integral just considered is obtained in the problem of exciting oscillations in a system with a natural frequency $\omega$ under the action of a force varying with time by the law $f(t) = \frac{1}{1 + t^2}$. This result gives the law of diminishing amplitude of the oscillations being excited as the natural frequency $\omega$ of a system is increased (see Sec. 7.5).

It is interesting to view formula (40) from the standpoint of the results of Sec. 3.4. This is precisely the case where the function $f(x) = \frac{1}{1 + x^2}$ vanishes together with all its derivatives at the endpoints $x = \pm\infty$ of the interval of integration. Since all these derivatives are devoid of discontinuities, from Sec. 3.4 it follows that the integral $I_1(\omega)$ tends to zero, as $\omega \to \infty$, faster than any negative power of $\omega$; however, the exact rate of this approach is only found through the use of residue theory. The measure of nonsmoothness of the function $f(x)$ that determines this rate is the distance from the pole of this function, which is continued into the complex plane, to the real axis.

Of  course,  so  brief  a section  as  this  could  not,  unfortunately, 
teach  the  techniques  of  integrating  complex  functions,  but  we  hope 
that  it  has  given  the  reader  a feeling  of  the  beauty  of  the  theory 
of  functions  of  a complex  variable. 

Exercises 

1. Evaluate the integral $\oint_{(L)} \frac{dz}{e^z - 1}$ around a circle $(L)$ of radius 4 with centre at the point $3i$.

2. Compute the integral $\int_{-\infty}^{\infty} \frac{dx}{x^6 + 1}$.


ANSWERS  AND  SOLUTIONS 


Sec.  5.1 

y 2 ^cos  ^ + i sin  — j , 5(cos  9 + i sin  9)  ^9  = arc  tan  j > 

2 ^cos  ~ + i sin  j , 3(cos  7r  + i sin  n),  1 (cos  0 + i sin  0), 
0(cos  9 + i sin  9)  (9  arbitrary). 


Sec.  5.2 

1. If $\frac{z_1}{z_2} = w$, then $z_1 = z_2 w$, whence $z_1^* = (z_2 w)^* = z_2^* w^*$, i.e. $w^* = \frac{z_1^*}{z_2^*}$.

2. Since the second root is equal to $2 + i$, the left-hand member must be divisible by $[z - (2 - i)][z - (2 + i)] = z^2 - 4z + 5$. Performing the division and solving the remaining quadratic equation, we get $z_{3,4} = 1 \pm \sqrt{3}$.

3. If $z = x + iy$, then $z^* = x - iy = x + i(-y)$ and $z^{**} = x - i(-y) = x + iy = z$.

4. $zz^* = x^2 + y^2 = |z|^2$. It is interesting that the real nature of the product $zz^*$ may be proved on the basis of the properties of conjugate numbers without passing to the real and imaginary parts: $(zz^*)^* = z^* z^{**} = z^* z = zz^*$. Now a number equal to its conjugate is of necessity real.

Sec.  5.3 

1. (a) $1 + i = \sqrt{2}\,e^{i\pi/4}$, (b) $1 - i = \sqrt{2}\,e^{-i\pi/4} = \sqrt{2}\,e^{i7\pi/4}$, (c) $-1 = e^{i\pi}$, (d) $3i = 3e^{i\pi/2}$.

2. (a) First write the number $1 + i$ in exponential form to get $1 + i = \sqrt{2}\,e^{i\pi/4}$, whence $(1 + i)^{16} = (\sqrt{2}\,e^{i\pi/4})^{16} = (\sqrt{2})^{16} e^{i4\pi} = 2^8 e^{i4\pi} = 2^8 = 256$, (b) $-1$.

3. $\cos 3\varphi = \cos^3\varphi - 3\cos\varphi\sin^2\varphi$, $\sin 4\varphi = 4\cos^3\varphi\sin\varphi - 4\cos\varphi\sin^3\varphi$.

4. In Euler's formula $e^{i\varphi} = \cos\varphi + i\sin\varphi$, we replace $\varphi$ by $-\varphi$ to get $e^{-i\varphi} = \cos\varphi - i\sin\varphi$. Adding these two formulas term by term, we get $e^{i\varphi} + e^{-i\varphi} = 2\cos\varphi$, whence $\cos\varphi = \frac{e^{i\varphi} + e^{-i\varphi}}{2}$. The second formula is obtained by term-by-term subtraction.


Sec.  5.4 

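1. (a) $\ln(-1) = i(\pi + 2k\pi)$, (b) $\ln i = i\left(\frac{\pi}{2} + 2k\pi\right)$, (c) $\ln(-i) = i\left(\frac{3\pi}{2} + 2k\pi\right)$, (d) $\ln(1 + i) = \frac{1}{2}\ln 2 + i\left(\frac{\pi}{4} + 2k\pi\right)$.

2. Since $-1 = \cos\pi + i\sin\pi$, the general form of numbers $x_k$ such that $x_k^3 = -1$ is

$$x_k = \cos\frac{\pi + 2k\pi}{3} + i\sin\frac{\pi + 2k\pi}{3} \quad (k = 0, 1, 2)$$

Therefore

$$x_0 = \cos\frac{\pi}{3} + i\sin\frac{\pi}{3} = \frac{1}{2} + i\frac{\sqrt{3}}{2}, \quad x_1 = \cos\pi + i\sin\pi = -1, \quad x_2 = \cos\frac{5\pi}{3} + i\sin\frac{5\pi}{3} = \frac{1}{2} - i\frac{\sqrt{3}}{2}$$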

<3  , 1 1 ■ V3 

3.  1,  — + % — 

2 2 


1 . . K3 
— + 1 — 

2 2 


• V3  1 • 1 3 

X » l 

2 2 2 


4. On the basis of Exercise 4, Sec. 5.3, we arrive at the equation $\frac{e^{i\varphi} + e^{-i\varphi}}{2} = 2$, whence, after a few simple manipulations, we get

$$e^{2i\varphi} - 4e^{i\varphi} + 1 = 0$$

This is a quadratic equation in $e^{i\varphi}$; solving it, we get $e^{i\varphi} = 2 \pm \sqrt{3}$. From this we have $i\varphi = \ln(2 \pm \sqrt{3}) + 2k\pi i$ ($k$ any integer). Dividing by $i$ and noting that $2 - \sqrt{3} = \frac{1}{2 + \sqrt{3}}$ and therefore $\ln(2 - \sqrt{3}) = -\ln(2 + \sqrt{3})$, we get the final answer: $\varphi = \pm\ln(2 + \sqrt{3})\,i + 2k\pi = \pm 1.317i + 2k\pi$. All these solutions are imaginary.

In similar fashion we obtain $\varphi = \pm\ln(2 + \sqrt{3})\,i + \frac{\pi}{2} + 2k\pi$.


Sec.  5.5 

m 

1.  / = T—r  el0it.  We  thus  get  the  same  formulas  as  for 

R + i\<*L 

{ o>C) 

the case of the RL circuit discussed in the text; but in place of $\omega L$ we have to substitute $\omega L - \frac{1}{\omega C}$. In the special case of $\omega^2 = \frac{1}{LC}$, we get $\omega L - \frac{1}{\omega C} = 0$, which is to say the inductance and capacitance cancel, as it were.

2.  J = (—  + — ~ ] ^Oe^eioit,  whence 

\ R uoL  / 

Jo  = IP  + JJD  ?<>’  a = P + arg  (R  -f  i^L)  — A 
for  the  same  meanings  of  the  notation  as  in  the  text. 


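Sec. 5.6

$f(z) = (x + iy)^3 = x^3 + 3ix^2y - 3xy^2 - iy^3$, whence $u = x^3 - 3xy^2$, $v = 3x^2y - y^3$,

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} = 3x^2 - 3y^2, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y} = 6xy.$$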


Sec.  5.7 

1. From the condition $u = x + x^2 - y^2$ we get $\frac{\partial u}{\partial x} = 1 + 2x$. Using the first of the Cauchy-Riemann conditions, we get $\frac{\partial v}{\partial y} = 1 + 2x$. To find $v$ from this condition, it is sufficient to integrate the equation with respect to $y$, holding $x$ constant. This yields

$$v(x, y) = y + 2xy + \varphi(x)$$

(Since $x$ is constant in the integration, the role of the constant of integration can be taken by any function $\varphi$ that is dependent solely on the single variable $x$.)

We now find $\frac{\partial v}{\partial x} = 2y + \varphi'(x)$. But according to the second of the Cauchy-Riemann conditions, $\frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}$. Since $\frac{\partial u}{\partial y} = -2y$, we get $2y + \varphi'(x) = 2y$, whence $\varphi'(x) = 0$. That is, $\varphi(x) = C$ where $C$ is a constant. Hence $v(x, y) = y + 2xy + C$.

To determine $C$ take advantage of the condition $f(0) = 0$. It means that $u = 0$, $v = 0$ for $x = 0$, $y = 0$. Thus, $v = 0$ for $x = 0$, $y = 0$, and so $C = 0$, $v(x, y) = y + 2xy$.

2.  y)  = — %2  + y2  + !• 

Sec.  5.8 

(a)  T > T ’ (b)  ~ ni>  7W’«  (c)  T ( 1 + i)>  T (— 1 + 0 (we  first  write 
3 3 o 3 

the result for the upper semicircle and then for the lower one). In the case of (a) the integrand is everywhere analytic and therefore the integral is independent of the path of integration. In case (b) the integrand is infinite at $z = 0$, in case (c) it has a branch point there.

Sec.  5.9 

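1. The integrand has two first-order poles inside $(L)$: $z_1 = 0$, $z_2 = 2\pi i$. Therefore the integral is equal to

$$2\pi i\left(\operatorname{Res}_{z=0} \frac{1}{e^z - 1} + \operatorname{Res}_{z=2\pi i} \frac{1}{e^z - 1}\right) = 2\pi i\left(\frac{1}{e^0} + \frac{1}{e^{2\pi i}}\right) = 4\pi i$$

2. Like Example (34), the integral here is equal to the integral of the function $f(z) = \frac{1}{1 + z^6}$ over the contour $(L_R)$ of Fig. 67 for large $R$. But within $(L_R)$, $f(z)$ has three simple poles: $z_1 = \frac{\sqrt{3}}{2} + \frac{i}{2}$, $z_2 = i$, $z_3 = -\frac{\sqrt{3}}{2} + \frac{i}{2}$, and so, by formula (32) and the relation $z_k^6 = -1$, the integral is equal to

$$2\pi i\left[\frac{1}{6z_1^5} + \frac{1}{6z_2^5} + \frac{1}{6z_3^5}\right] = -\frac{2\pi i}{6}(z_1 + z_2 + z_3) = \frac{2\pi}{3}$$

In this example, the indefinite integral can be expressed in terms of elementary functions, but the method of evaluation given here is much simpler.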


Chapter  6 

DIRAC’S  DELTA  FUNCTION 


6.1 Dirac's delta function $\delta(x)$

Take the function $y = \Phi_1(x)$ with a maximum at $x = 0$ and rapidly decreasing on both sides of $x = 0$; a function such that

$$\int_{-\infty}^{+\infty} \Phi_1(x)\,dx = 1$$

These conditions do not in any way determine the type of function $\Phi_1(x)$, because it is easy to think up a variety of functions that satisfy all the above requirements. For example,

$$\Phi_1(x) = \frac{1}{\pi}\,\frac{1}{1 + x^2} \qquad (1)$$

$$\Phi_1(x) = \frac{1}{\sqrt{\pi}}\,e^{-x^2} \qquad (2)$$

The numerical factor ensures that the integral equals unity.** The graphs of these functions are shown in Fig. 68. Now transform the curve $y = \Phi_1(x)$ as follows: increase the height $m$-fold and diminish the width the same number of times. It will be recalled (see, for example, HM, Sec. 1.7) that if we increase the height of a curve $y = \Phi_1(x)$ $m$ times, then the equation of the curve takes the form $y = m\Phi_1(x)$, and if we diminish the width $m$ times, the equation becomes $y = \Phi_1(mx)$. Thus, after both transformations, the equation of the curve becomes $y = \Phi_m(x) = m\Phi_1(mx)$. For example, from (1) we get $\Phi_m(x) = \frac{m}{\pi}\,\frac{1}{1 + (mx)^2}$. It is clear that the area between the curve and the $x$-axis increases $m$-fold when stretched upwards and diminishes the same number of times when squeezed at the sides, which means that it remains unchanged. Incidentally, this can readily be demonstrated also by integration after changing the variable of integration $mx = s$:

$$\int_{-\infty}^{\infty} \Phi_m(x)\,dx = \int_{-\infty}^{\infty} m\Phi_1(mx)\,dx = \int_{-\infty}^{\infty} \Phi_1(mx)\,d(mx) = \int_{-\infty}^{\infty} \Phi_1(s)\,ds = \int_{-\infty}^{\infty} \Phi_1(x)\,dx$$

* This chapter is closely related to the material of HM, Ch. 9.

** It can be demonstrated that $\int_{-\infty}^{+\infty} e^{-x^2}\,dx = \sqrt{\pi}$. See Sec. 4.7.

Fig. 68


What form does the transformed function take for very large $m$ or, to put it more precisely, in the limit, when $m$ increases beyond all bounds? For any fixed $x \ne 0$, the quantity $y = m\Phi_1(mx)$ will approach zero as $m$ increases without bound because $\Phi_1(mx)$ decreases faster, with increasing $m$, than the growth of the factor $m$. For this it is necessary that, as $x \to \pm\infty$, $\Phi_1(x)$ approach zero faster than $\frac{1}{x}$ (this is what is meant by saying that the function is a rapidly decreasing one).* For example, in the expression $\Phi_m(x) = \frac{m}{\pi}\,\frac{1}{1 + (mx)^2}$ for a given $x \ne 0$ and for sufficiently great $m$ it will be true that $(mx)^2 \gg 1$ and, hence, $\Phi_m(x) \approx \frac{m}{\pi m^2 x^2} = \frac{1}{\pi m x^2}$, which is to say that it decreases indefinitely as $m$ increases. The quantity $\Phi_m(x)$ obtained from formula (2) decreases even faster as $m$ increases. Indeed, in this case $\Phi_m(x) = \frac{m}{\sqrt{\pi}}\,e^{-(mx)^2}$ and we know that an exponential function with a negative exponent decreases faster than any power of $m$ (see, for example, HM, Sec. 3.21).

Now let $x = 0$. Then $\Phi_1(mx) = \Phi_1(0)$ is constant for any $m$ and, therefore, $\Phi_m(0) = m\Phi_1(0)$ increases without bound with increasing $m$.

* Such a rate of decrease automatically follows from the convergence of the integral (see Sec. 3.1).

Consequently, by increasing $m$ without bound we obtain a function with the following properties:

(1) the function is zero for all $x < 0$ and for all $x > 0$;

(2) the function is infinite when $x = 0$;

(3) the integral of this function taken from $-\infty$ to $+\infty$ is equal to 1.

A function having these properties is termed the Dirac delta function, denoted by $\delta(x)$.* The delta function is extremely convenient and is broadly used today in physics.
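The limiting process can be watched numerically. The sketch below is an illustration and not part of the original text: it builds $\Phi_m(x) = m\Phi_1(mx)$ from formulas (1) and (2) and prints, for growing $m$, the peak value, the value at the fixed point $x = 0.5$, and the area under the Gaussian variant. The function names, the integration window $[-10, 10]$ and the midpoint rule are assumptions made for the demonstration.

```python
import math

def phi_lorentz(x):
    """Formula (1): (1/pi) / (1 + x^2)."""
    return 1.0 / (math.pi * (1.0 + x * x))

def phi_gauss(x):
    """Formula (2): exp(-x^2) / sqrt(pi)."""
    return math.exp(-x * x) / math.sqrt(math.pi)

def scaled(phi, m):
    """Return the function Phi_m(x) = m * Phi_1(m x)."""
    return lambda x: m * phi(m * x)

def midpoint_integral(f, a, b, n):
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

for m in (1, 10, 100):
    g = scaled(phi_gauss, m)
    lor = scaled(phi_lorentz, m)
    area = midpoint_integral(g, -10.0, 10.0, 100_000)
    # peak grows like m, values at x = 0.5 shrink, area stays close to 1
    print(m, g(0.0), g(0.5), lor(0.5), area)
```

The Gaussian variant is used for the area check because its tails are negligible on a finite window; for formula (1) the area over a finite window approaches 1 only slowly.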

We arrived at the concept of the delta function by considering ordinary familiar functions and transforming them in a special way. It is a remarkable fact, however, that to use the delta function it suffices to know the three properties that we listed above and one does not at all need to know from which function ($\frac{1}{\pi}\,\frac{1}{1 + x^2}$ or $\frac{1}{\sqrt{\pi}}\,e^{-x^2}$ or any other one) the delta function is derived. Crudely stated, the delta function is a function that assumes large values on a narrow interval, and these values are in agreement with the width of the interval in such a manner that condition (3) holds true. (From this there follows, for one thing, that the dimensions of $[\delta(x)] = 1/[x]$.)

From the properties of $\delta(x)$ follows the basic relation

$$I = \int_{-\infty}^{+\infty} \delta(x) f(x)\,dx = f(0) \qquad (3)$$

Indeed, $\delta(x) = 0$ for all $x \ne 0$ and so

$$I = \int_{-\infty}^{+\infty} \delta(x) f(x)\,dx = \int_{-\varepsilon}^{+\varepsilon} \delta(x) f(x)\,dx$$

where $\varepsilon$ is a small quantity. In the latter integral the interval of integration is small (it is of length $2\varepsilon$), and so, there, $f(x) \approx f(0)$; consequently

$$I = \int_{-\varepsilon}^{+\varepsilon} \delta(x) f(x)\,dx = \int_{-\varepsilon}^{+\varepsilon} \delta(x) f(0)\,dx = f(0)\int_{-\varepsilon}^{+\varepsilon} \delta(x)\,dx = f(0)\int_{-\infty}^{+\infty} \delta(x)\,dx = f(0)$$


* Paul Adrien Maurice Dirac, in honour of whom this function is named, is the celebrated English theoretical physicist who in 1929 predicted the existence of antiparticles: the positron, the antiproton, and others, which were later discovered experimentally.

To summarize, then, the formula (3) follows from the three properties of $\delta(x)$. The converse is valid as well: if we define $\delta(x)$ as a function for which the relation (3) holds for any $f(x)$, then from this follow all the three properties of the delta function. Without dwelling on a detailed proof of this fact, we will show, for example, that (3) implies the first property: $\delta(x) = 0$ for $x \ne 0$.

Indeed, from (3) it is clear that the value of the integral does not depend on the behaviour of the function $f(x)$ for $x \ne 0$, but depends only on $f(0)$. This means that $f(x)$ stands under the integral sign with a factor equal to zero for $x \ne 0$, i.e. $\delta(x) = 0$ for $x \ne 0$.

Note that $\delta(x - a)$ is different from zero (it is infinite) only when $x = a$. Reasoning as in the derivation of formula (3), we get the formula

$$\int_{-\infty}^{+\infty} \delta(x - a) f(x)\,dx = f(a) \qquad (4)$$

It is worth noting certain other interesting formulas for the delta function. First,

$$\delta(ax) = \frac{1}{|a|}\,\delta(x) \quad (a = \text{constant} \ne 0)$$

Indeed, the function $|a|\,\delta(ax)$ satisfies all three properties that define a delta function. The slightly less obvious third property is verified via the substitution $ax = x_1$ thus:

$$a > 0: \quad \int_{-\infty}^{\infty} |a|\,\delta(ax)\,dx = \int_{-\infty}^{\infty} \delta(ax)\,a\,dx = \int_{-\infty}^{\infty} \delta(x_1)\,dx_1 = 1$$

$$a < 0: \quad \int_{-\infty}^{\infty} |a|\,\delta(ax)\,dx = -\int_{-\infty}^{\infty} \delta(ax)\,a\,dx = -\int_{\infty}^{-\infty} \delta(x_1)\,dx_1 = \int_{-\infty}^{\infty} \delta(x_1)\,dx_1 = 1$$

And yet another property:

$$\delta(\varphi(x)) = \frac{1}{|\varphi'(x_0)|}\,\delta(x - x_0)$$

if $\varphi(x)$ vanishes only when $x = x_0$. This property follows from the preceding one since close to $x = x_0$, to within higher infinitesimals, we can write

$$\varphi(x) = \varphi(x_0) + \varphi'(x_0)(x - x_0) = \varphi'(x_0)(x - x_0)$$

Finally,

$$f(x)\,\delta(x - a) = f(a)\,\delta(x - a)$$

which follows immediately from the fact that the function $\delta(x - a)$ is equal to zero for $x \ne a$.
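The scaling rule and the rule for $\delta(\varphi(x))$ can be checked in the same approximate way. The following sketch is an illustration, not part of the original text; the narrow Gaussian `delta_m`, the choice $a = -4$, and the function $\varphi(x) = e^x - 2$ (whose only zero is $x_0 = \ln 2$, with $\varphi'(x_0) = 2$) are assumptions picked for the demonstration.

```python
import math

def delta_m(x, m=2000.0):
    """Narrow Gaussian standing in for the delta function (an approximation)."""
    return m * math.exp(-(m * x) ** 2) / math.sqrt(math.pi)

def integrate(f, a, b, n=200_000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

f = math.cos

# delta(a x) = delta(x) / |a|, here with a = -4: expect f(0) / 4
scaled = integrate(lambda x: delta_m(-4.0 * x) * f(x), -1.0, 1.0)

# delta(phi(x)) = delta(x - x0) / |phi'(x0)|, here phi(x) = exp(x) - 2
x0 = math.log(2.0)  # the only zero of phi; phi'(x0) = 2
composed = integrate(lambda x: delta_m(math.exp(x) - 2.0) * f(x), 0.0, 1.5)

print(scaled, f(0.0) / 4.0)
print(composed, f(x0) / 2.0)
```

In both cases the integral of the approximate delta function times $f$ lands close to the value predicted by the corresponding formula, with the factor $1/|a|$ or $1/|\varphi'(x_0)|$ appearing automatically.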
Many physical relations are very conveniently written with the aid
of the delta function. Consider, for instance, a narrow rod with loads suspended at a variety of points. Let the dimensions of the loads be small compared to the length of the rod, and their masses of the same order as that of the rod. Then when solving problems (determining the total mass, the equilibrium position, and the like), one has to consider both the masses of the loads (called the localized masses) and the mass of the rod (distributed mass). Suppose the density of the rod is $\rho_{dis}(x)$.* Then the mass of the rod, one endpoint of which lies at the coordinate origin and the other end at the point $x = l$, is equal to $M_{dis} = \int_0^l \rho_{dis}(x)\,dx$.

Suppose there is a weight $m_a$ at the point $x = a$ of the rod. Then the total mass of the rod and the weight is

$$m = m_a + \int_0^l \rho_{dis}(x)\,dx$$

Using the delta function, we can represent the localized mass as a mass distributed with density $\rho_{loc}(x) = m_a\,\delta(x - a)$. Indeed, the last formula means that the density is different from zero only in the small neighbourhood of the point $x = a$, and

$$\int_0^l \rho_{loc}(x)\,dx = m_a \int_0^l \delta(x - a)\,dx = m_a$$

This means that by introducing the function $\rho_{loc}(x)$ we can write the localized mass in a form that in aspect coincides with the notation for a distributed mass.

Now set

$$\rho(x) = \rho_{dis}(x) + \rho_{loc}(x) = \rho_{dis}(x) + m_a\,\delta(x - a)$$

Then the total mass is

$$m = \int_0^l \rho(x)\,dx$$

In other words, the total mass need not be written as the sum of terms of different types, but, via the integral, in the same manner as the distributed mass. The different nature of the distributed and localized masses only affects the aspect of the function $\rho(x)$. Thanks to this, we can now write in a unified manner all the relations that we have already obtained for a distributed mass.

* The subscripts "dis" and "loc" stand for "distributed" and "localized", respectively.


For example, the mass between the points $x = b$ and $x = c$ of the rod is equal to $\int_b^c \rho(x)\,dx$. No stipulations need be made now that $b > a$ or that $b < a$, and so forth: the function $\rho(x)$ contains $\delta(x - a)$, and integration of the function $\delta(x - a)$ automatically adds to the mass of the rod the mass of the load, provided that the load is located between the points $x = b$ and $x = c$.

In the same way, the position of the centre of gravity of the rod is given by the formula

$$x_c = \frac{\int_0^l x\rho(x)\,dx}{\int_0^l \rho(x)\,dx}$$

irrespective of whether there is a localized mass on the rod or not.

The case of several localized masses is considered in exactly the same way.
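The unified density can be tried out numerically. The sketch below is an illustration and not part of the original text: the rod length, the uniform distributed density, the load and its position are hypothetical values, and $\delta(x - a)$ is replaced by a narrow Gaussian. With these numbers the expected total mass is $3 \cdot 2 + 4 = 10$ and the expected centre of gravity is $(6 \cdot 1 + 4 \cdot 0.5)/10 = 0.8$.

```python
import math

L_ROD = 2.0         # rod length l (illustrative value)
RHO_DIS = 3.0       # uniform distributed density (illustrative value)
M_A, A = 4.0, 0.5   # localized mass m_a and its position a (illustrative values)

def delta_m(x, m=1000.0):
    """Narrow Gaussian standing in for the delta function (an approximation)."""
    return m * math.exp(-(m * x) ** 2) / math.sqrt(math.pi)

def rho(x):
    """Unified density: distributed part plus m_a * delta(x - a)."""
    return RHO_DIS + M_A * delta_m(x - A)

def integrate(f, a, b, n=200_000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

total_mass = integrate(rho, 0.0, L_ROD)
x_c = integrate(lambda x: x * rho(x), 0.0, L_ROD) / total_mass
mass_without_load = integrate(rho, 0.0, 0.25)   # segment that misses the load
mass_with_load = integrate(rho, 0.25, 1.0)      # segment that contains the load
print(total_mass, x_c, mass_without_load, mass_with_load)
```

The same function `rho` serves every query; whether the load is counted depends only on whether the interval of integration contains the point $a$.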

Let us take another example. In mechanics one considers forces smoothly varying with time and also sharp impacts, collisions of bodies. In the case of an impact, the body is acted on by a large force during a brief time interval. In most cases, any detailed study of the dependence of the force on the time during which the impact lasted is of no interest (see, for instance, HM, Sec. 6.5). It is enough to know the impulse, i.e. $I_{imp} = \int F\,dt$. Then the force at impact can be written thus: $F(t) = I_{imp}\,\delta(t - t_{imp})$, where $t_{imp}$ is the time of impact and $I_{imp}$ is the impulse of the impact. This notation shows that the force is different from zero only at the instant of impact and the impulse is equal to $I_{imp}$.

Note in conclusion that the delta function is considered only for real values of the independent variable.

Exercises 

1. Evaluate $\int_{-\infty}^{\infty} x^2\,\delta(x - 3)\,dx$.

2. Simplify the expressions:

(a) $(x^2 + 3)\,\delta(x + 5)$; (b) $\delta(2x - 8)$;

(c) $\delta(x^2 + x - 2)$.

6.2  Green’s  function 

Let us first consider an example. Suppose a thin flexible string of length $l$ is stretched along the $x$-axis by a constant force $F$. In the system depicted in Fig. 69, this tension is accomplished by a block and weight. Let the string be acted on perpendicular to the $x$-axis by a force distributed with density $f(x)$; that is, a force $f(x)\,dx$ operates over a small section of the string between the points $x$ and $x + dx$, and the force $\int_0^l f(x)\,dx$ over the whole string. Let us find the shape $y(x)$ that the string then takes up. Here the function $y(x)$ is the deflection of that point of the string which was at $x$ of the $x$-axis in the original state.

Fig. 69

We  will  assume  that  the  tension  F of  the  string  is  much  greater 
than  the  total  force  acting  on  the  string  so  the  deflection  is  slight. 
Then  we  can  take  advantage  of  the  law  of  linearity,  according  to 
which  the  corresponding  deflections  in  the  case  of  several  superimposed 
weights  are  additive. 

To begin with, let us suppose that the applied weight is of a special type, namely, that it is a unit localized weight applied to a certain point of action $\xi$ of the $x$-axis; we denote by $y = G(x, \xi)$ the corresponding deflection at any point of observation $x$ (Fig. 70). This function $G(x, \xi)$ is called the influence function or Green's function (after the English mathematician George Green (1793-1841)) of this problem. We will now show that if it is known, then it is also easy to find the deflection due to the action of an arbitrary weight with density $f(x)$.

Consider the load on the section of the axis from point $\xi$ to point $\xi + d\xi$. It is equal to $f(\xi)\,d\xi$; and so the deflection due to it at the point $x$ is equal to $G(x, \xi) f(\xi)\,d\xi$ since it follows from the law of linearity that if an external load is multiplied by a constant factor, then the deflection is likewise multiplied by that factor. Adding together such infinitesimal deflections due to all elements of the load from $\xi = 0$ to $\xi = l$, we get the overall deflection (see Fig. 69):

$$y = h(x) = \int_0^l G(x, \xi) f(\xi)\,d\xi \qquad (5)$$

In the example at hand it is easy to write down the function $G(x, \xi)$ in explicit form. Find the components of the tensile force of the string along the $y$-axis. To the left of the point $\xi$ it is equal (see Fig. 70) to

$$-F\sin\alpha \approx -F\,\frac{z}{\xi}$$

Fig. 70

where $z$ is the deflection of the point $\xi$ which is not specified beforehand. Note that in this derivation we took advantage of the small nature of the deflections and for this reason we replaced the hypotenuse of the triangle by the larger leg when computing the sine. In a similar manner, we obtain the component of the tensile force to the right of $\xi$:

$$-F\,\frac{z}{l - \xi}$$

If under the given force the string is in equilibrium, this means that the sum of all forces acting on the string, that is, the sum of the forces of tension and the given force, is equal to zero. For this reason, the sum of the components of these forces along the $y$-axis is also zero. Since in our case the given force is equal to 1 and acts along the $y$-axis, we get, on the basis of the foregoing,

$$1 - F\,\frac{z}{\xi} - F\,\frac{z}{l - \xi} = 0$$

whence we find $z = \frac{\xi(l - \xi)}{Fl}$. If $z$ is known, then the deflection of any point of the string can readily be found by using the fact that the string has the shape of a broken line. We obtain

$$y(x) = z\,\frac{x}{\xi} \ \text{ if } x < \xi, \qquad y(x) = z\,\frac{l - x}{l - \xi} \ \text{ if } x > \xi$$

Substituting the value of $z$ just found and recalling that the shape of the deflection under a unit localized load yields Green's function, we get

$$G(x, \xi) = \begin{cases} \dfrac{x(l - \xi)}{Fl} & \text{if } x < \xi \\[2mm] \dfrac{\xi(l - x)}{Fl} & \text{if } x > \xi \end{cases}$$

This expression for Green's function may be substituted into (5) for the deflection due to an arbitrary load. Since $G(x, \xi)$ for $\xi < x$ and $\xi > x$ is written with the aid of different formulas, the integral is split into two integrals:

$$y = h(x) = \int_0^x G(x, \xi) f(\xi)\,d\xi + \int_x^l G(x, \xi) f(\xi)\,d\xi = \frac{l - x}{Fl}\int_0^x \xi f(\xi)\,d\xi + \frac{x}{Fl}\int_x^l (l - \xi) f(\xi)\,d\xi$$

The same result may be obtained by writing the differential equation for the function $y(x)$ and solving it. However, the remarkable thing is that we succeeded in finding the solution without even writing down the differential equation. All we needed to know was that the law of linearity is applicable.
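The recipe can be exercised on a simple case. The sketch below is an illustration, not part of the original text: it evaluates formula (5) numerically for a uniform load $f(\xi) = f_0$ and compares the result with $f_0 x(l - x)/(2F)$, which is what the two integrals above give when evaluated by hand for constant $f$. The values of $F$, $l$, $f_0$ and the function names are hypothetical.

```python
def green(x, xi, F, l):
    """Deflection at x due to a unit load at xi, for a string fixed at 0 and l."""
    if x <= xi:
        return x * (l - xi) / (F * l)
    return xi * (l - x) / (F * l)

def deflection(x, load, F, l, n=2000):
    """Formula (5) by the midpoint rule, split at xi = x where G has a kink."""
    def part(a, b):
        if b <= a:
            return 0.0
        h = (b - a) / n
        return h * sum(
            green(x, a + (k + 0.5) * h, F, l) * load(a + (k + 0.5) * h)
            for k in range(n)
        )
    return part(0.0, x) + part(x, l)

F, l, f0 = 50.0, 1.0, 2.0   # tension, length, load density (illustrative values)

def uniform(xi):
    return f0

for x in (0.25, 0.5, 0.75):
    print(x, deflection(x, uniform, F, l), f0 * x * (l - x) / (2 * F))
```

Splitting the integral at $\xi = x$ mirrors the derivation in the text and keeps the midpoint rule exact here, since on each piece the integrand is linear in $\xi$.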

Now let us take up the general scheme for constructing the influence function. Let the external action exerted on an object be described by the function $f(x)$ ($a \le x \le b$; in the example above this was the function $f(x)$) and let the result of the action be given by the function $F(x)$ (this was $h(x)$ in the foregoing example). We can imagine that to every given function $f$ there corresponds a new function $F$, that is to say, every function $f$ is transformed via some specific law into a new function $F$. In mathematics, such a law of transformation of preimage functions into image functions is termed an operator. An example is the familiar differentiation operator $D$, which operates via the law $Df = f'$, i.e. $D(\sin x) = \cos x$, $D(x^3) = 3x^2$, and so on. Here, $\sin x$ is the preimage (inverse image) that is transformed by the operator $D$ into the image $\cos x$. The notion of operator is similar to that of function, but whereas the function gives a law for the transformation of numbers (values of the independent variable) into numbers (values of the dependent variable), an operator transforms functions into functions.

Let us denote by $L$ the transition operator from the function
of external action $f(x)$ to the response function $F(x)$, so that $F = Lf$.
We assume that the law of linearity (or the principle of superposition)
holds: this means that when external actions are added, so are their
results. This law holds true with sufficient accuracy when the external
actions are not too great. It can be written as follows:

$$L(f_1 + f_2) = Lf_1 + Lf_2 \tag{6}$$

An operator having this property is called a linear operator. (To
illustrate, verify that the differentiation operator is linear.) From this
property we can conclude that when an external action is multiplied by
a constant, the result is multiplied by the same constant:

$$L(Cf) = CLf \quad (C \text{ constant}) \tag{7}$$

We do not give the proof here. (Try to justify it by first assuming
$C$ to be a positive integer, then set $C = \frac{1}{n}$ ($n = 2, 3, 4, \ldots$)
and $C = \frac{m}{n}$, where $m$ and $n$ are positive integers, then take
$C = 0$, and finally take $C$ negative.)
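The suggested check that the differentiation operator is linear can also be tried numerically. The following is only a sketch under stated assumptions: the operator $D$ is replaced by a central-difference approximation, and the test functions, the constant `C` and the sample point `x` are arbitrary choices made here for illustration.

```python
import math

def D(f, h=1e-5):
    """Central-difference stand-in for the differentiation operator D."""
    return lambda x: (f(x + h) - f(x - h)) / (2 * h)

f1, f2 = math.sin, math.exp   # arbitrary test functions
C, x = 3.7, 0.4               # arbitrary constant and sample point

# Property (6): D(f1 + f2) = D f1 + D f2
lhs_sum = D(lambda t: f1(t) + f2(t))(x)
rhs_sum = D(f1)(x) + D(f2)(x)

# Property (7): D(C f) = C D f
lhs_scaled = D(lambda t: C * f1(t))(x)
rhs_scaled = C * D(f1)(x)

# Both differences should be at the level of floating-point round-off.
print(abs(lhs_sum - rhs_sum), abs(lhs_scaled - rhs_scaled))
```

A numerical agreement of this kind is of course no proof; it merely illustrates what properties (6) and (7) assert.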

In the example discussed at the beginning of this section we
regarded $G(x, \xi)$ as the result of the action of a unit force localized at
a certain point $\xi$, or, in other words (see Sec. 6.1), distributed with
density $\delta(x - \xi)$. So in the general case as well we denote by $G(x, \xi)$
the result of an external action described by the delta function with
a singularity at some fixed point $\xi$, that is, the function $\delta(x - \xi)$.
Thus

$$G(x, \xi) = L[\delta(x - \xi)]$$

How can we, using Green's function $G(x, \xi)$, express the result
of transforming any given function $f(x)$? To do this, represent $f$ as
a sum of "column" functions (Fig. 71), each of which is localized at
a single point $\xi$ and is zero outside an infinitesimal neighbourhood
of this point. Such a column function is proportional to the delta
function $\delta(x - \xi)$, and since the integral of a column function is
equal to $f(\xi)\,d\xi$, it is simply equal to $[f(\xi)\,d\xi]\,\delta(x - \xi)$. We thus get
the representation

$$f(x) = \sum_{\xi} [f(\xi)\,d\xi]\,\delta(x - \xi)$$

Strictly speaking, in the case of infinitesimal $d\xi$ we ought to write
the integral sign instead of the summation symbol, so that actually
this is formula (4) in different notation. But the law of linearity for
sums carries over, in the limit, to integrals as well.

By virtue of property (7), every column function is transformed to

$$L\big[[f(\xi)\,d\xi]\,\delta(x - \xi)\big] = [f(\xi)\,d\xi]\,L[\delta(x - \xi)] = f(\xi)\,G(x, \xi)\,d\xi$$

And so the sum of such functions, by property (6), is transformed to

$$L\Big[\sum_{\xi} [f(\xi)\,d\xi]\,\delta(x - \xi)\Big] = \sum_{\xi} f(\xi)\,G(x, \xi)\,d\xi$$

This sum, for infinitesimal $d\xi$, is an integral, and we finally get

$$F(x) = L[f(x)] = \int_a^b G(x, \xi)\,f(\xi)\,d\xi \tag{8}$$

(compare with formula (5)).

Fig. 71

The influence function can be computed theoretically in the
simpler cases, as in the foregoing example, and experimentally in the
more complicated problems by carrying out the necessary measurements
(for example, measuring the deformation of a system under
the action of a localized force). In such cases, the possibility of applying
the superposition principle (or, as we say, the linearity of the system)
either follows from general theoretical principles or can be verified
experimentally. After Green's function has been found and the linearity
of the system established, the solution of the problem is given by
formula (8) for any external action $f$.
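The procedure just described (probe the system with localized unit actions, record the responses, then use formula (8)) can be mimicked on a grid. The sketch below is illustrative only: the "system" `L` is a hypothetical black box chosen here (a running integral), the interval is taken as $0 \le x \le 1$, and the delta function is replaced by a column of unit area one grid cell wide.

```python
import math

n = 200
a, b = 0.0, 1.0
h = (b - a) / n
xs = [a + (i + 0.5) * h for i in range(n)]   # cell midpoints

def L(values):
    """Hypothetical linear system: running integral of the input from a to x."""
    out, total = [], 0.0
    for v in values:
        total += v * h
        out.append(total)
    return out

def column(j):
    """Column function of unit area in cell j, a discrete stand-in for delta(x - xi_j)."""
    return [1.0 / h if i == j else 0.0 for i in range(n)]

# "Measure" the influence function: G[i][j] is the response at x_i to a unit column at xi_j.
responses = [L(column(j)) for j in range(n)]
G = [[responses[j][i] for j in range(n)] for i in range(n)]

# Any other external action is then handled by formula (8), written as a sum.
f = [math.cos(3 * x) for x in xs]
F_direct = L(f)
F_green = [sum(G[i][j] * f[j] * h for j in range(n)) for i in range(n)]

max_diff = max(abs(u - v) for u, v in zip(F_direct, F_green))
print(max_diff)
```

For this particular choice of `L` the measured `G` is simply a unit step in $x - \xi$; the point of the sketch is that `F_green` needs only the measured responses and the linearity of the system, not any knowledge of how `L` works inside.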

Thus, at times, even the most general conceptions of the properties
of physical systems point the way to a solution of specific
problems.

In conclusion, note that the image functions $Lf$ need not necessarily
be defined on the same interval as the preimage functions $f$;
what is more, the independent variables $x$ and $\xi$ in (8) can have
different physical meaning. The independent variable $\xi$ can play
the role of time; then the influence function describes the result of
a "unit impulse" acting at time $\xi$.

Exercise

Write down the influence function for the operators:

(a) $Lf = 2f(x)$, (b) $Lf = \sin x \cdot f(x)$, (c) $Lf = f(x + 1)$,

(d) $Lf = f(x^2)$.

6.3 Functions related to the delta function

The delta function has been found useful in writing down certain
other important functions. An important example is the unit-step
function $e(x)$ (also denoted as $\theta(x)$):

$$e(x) = \int_{-\infty}^{x} \delta(x)\,dx \tag{9}$$

Clearly, for $x < 0$ we get $e(x) = 0$ and for $x > 0$ we get $e(x) = 1$.
Thus, $e(x)$ is a discontinuous function that suffers a jump at $x = 0$.
Its graph ("step") is shown in Fig. 72. It results from a sudden
switching-in of some constant action, say, voltage in an electric circuit
(the independent variable there is the time).

Equation (9) can also be obtained with the aid of approximate
representations of the delta function. In Sec. 6.1 we saw that one
such representation is the function $\frac{1}{\pi}\,\frac{m}{1 + (mx)^2}$ for $m$ large. But

$$\int_{-\infty}^{x} \frac{1}{\pi}\,\frac{m}{1 + (mx)^2}\,dx = \frac{1}{\pi}\arctan mx\,\Big|_{-\infty}^{x} = \frac{1}{\pi}\arctan mx + \frac{1}{2}$$

The graph of this integral for $m = 1$ and $m = 5$ is shown in
Fig. 73. In the limit, as $m \to \infty$, we obtain $e(x)$ from the integral,
which means we arrive at (9).
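The passage to the limit can be watched numerically. The short sketch below simply tabulates $\frac{1}{\pi}\arctan mx + \frac{1}{2}$ for a few values of $m$; the function name `smoothed_step` and the sample points $x = \pm 0.1$ are choices made here for illustration.

```python
import math

def smoothed_step(x, m):
    """Integral from -infinity to x of (1/pi) * m / (1 + (m t)^2) dt."""
    return math.atan(m * x) / math.pi + 0.5

for m in (1, 5, 100, 10_000):
    # Values to the left of, at, and to the right of the jump.
    print(m, smoothed_step(-0.1, m), smoothed_step(0.0, m), smoothed_step(0.1, m))
```

As $m$ grows, the values to the left of the origin approach 0 and those to the right approach 1, while the value at $x = 0$ stays equal to $\frac{1}{2}$ for every $m$.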

The same result may be obtained with the aid of the column
function, whose graph is shown in Fig. 74a (it too, in the limit as
$N \to \infty$, yields the delta function). The graph of the integral of this
function is shown in Fig. 74b. Here too we get (9) in the limit as
$N \to \infty$.

Fig. 72

Fig. 73

Equation (9) can be demonstrated in the following example from
physics. Consider the rectilinear motion of a mass $m$ under the action
of a variable force $F(t)$ directed along that same straight line. Writing
down the expression for Newton's second law and carrying out
the integration, we obtain the following equation (see, for example,
HM, Sec. 6.4):

$$v(t) = \frac{1}{m}\int_{-\infty}^{t} F(t)\,dt$$

(we assume that at $t = -\infty$ the initial velocity is zero). Let the force
be in the nature of an impact: $F(t) = I_{\text{imp}}\,\delta(t - t_{\text{imp}})$ (see end of
Sec. 6.1). Integrating, we get

$$v(t) = \frac{1}{m}\int_{-\infty}^{t} I_{\text{imp}}\,\delta(t - t_{\text{imp}})\,dt = \frac{I_{\text{imp}}}{m}\,e(t - t_{\text{imp}})$$

which is to say that the velocity $v$ is equal to zero prior to the impact
and is equal to $\frac{I_{\text{imp}}}{m}$ after the impact.

Now let us return to mathematics. If one differentiates (9), the
result is

$$e'(x) = \delta(x) \tag{10}$$

This equation can also be shown in the examples we have just discussed.
For an approximate representation of the function $e(x)$ we
can take the function given in Fig. 74b, the derivative of which
is shown in Fig. 74a; in the limit, as $N \to \infty$, we get (10).

Fig. 74

Here, we proceeded from the delta function and arrived at the
unit-step function with the aid of integration. We could go in the reverse
direction: we could proceed from the function $e(x)$ and obtain the
function $\delta(x)$ with the aid of differentiation. In other words, the delta
function is obtained from the discontinuous function $e(x)$ via differentiation.
In like manner, delta terms appear when differentiating any
discontinuous function.

Consider, say, the function $f(x)$ specified by two formulas:

$$f(x) = \begin{cases} x^3 & \text{if } 0 < x < 1, \\ x^2 + 2 & \text{if } 1 < x < \infty \end{cases}$$

The graph of this function, which has a discontinuity at $x = 1$, is shown
by the heavy lines in Fig. 75. It would be a mistake to equate $f'(x)$ merely
to the function $\varphi(x)$ obtained by differentiation of both formulas:

$$\varphi(x) = \begin{cases} 3x^2 & \text{if } 0 < x < 1, \\ 2x & \text{if } 1 < x < \infty \end{cases} \tag{11}$$

Actually, if we integrate the last function, say, from the value $x = 0$,
then we get

$$\text{for } 0 < x < 1: \quad \int_0^x \varphi(x)\,dx = \int_0^x 3x^2\,dx = x^3$$

$$\text{for } x > 1: \quad \int_0^x \varphi(x)\,dx = \int_0^1 \varphi(x)\,dx + \int_1^x \varphi(x)\,dx = \int_0^1 3x^2\,dx + \int_1^x 2x\,dx = x^3\,\Big|_0^1 + x^2\,\Big|_1^x = x^2$$

Fig. 76

Thus, we do not get $f(x)$ but rather a continuous function $f_1(x)$, whose
graph for $x > 1$ is shown in Fig. 75 by the dashed line. In order to
obtain $f(x)$ from $f_1(x)$, we have to add to the latter the "step"
with discontinuity at $x = 1$ equal to the discontinuity of the function
$f(x)$, that is to say, equal to

$$f(1 + 0) - f(1 - 0) = (1^2 + 2) - 1^3 = 2\,{}^{*}$$

And so $f(x) = f_1(x) + 2e(x - 1)$, whence we finally get

$$f'(x) = f_1'(x) + 2e'(x - 1) = \varphi(x) + 2\delta(x - 1)$$

where $\varphi(x)$ is given by the formulas (11).
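The role of the delta term can be checked by integrating the derivative back. In the sketch below (an illustration, not part of the original argument) the smooth part $\varphi$ is integrated by the midpoint rule, and the delta term is integrated exactly by means of (9), so that $2\delta(x - 1)$ contributes $2e(x - 1)$; the helper names are chosen here.

```python
def e(x):
    """Unit-step function: 0 for x < 0, 1 for x > 0."""
    return 1.0 if x > 0 else 0.0

def f(x):
    """The discontinuous function of the example (x > 0, x != 1)."""
    return x**3 if x < 1 else x**2 + 2

def f1(x):
    """Continuous function obtained by integrating phi from 0."""
    return x**3 if x < 1 else x**2

def phi(x):
    """Formula (11): termwise derivative of the two branches."""
    return 3 * x**2 if x < 1 else 2 * x

def integral_of_derivative(x, n=20000):
    """Integrate phi(t) + 2*delta(t - 1) from 0 to x; the delta term gives 2*e(x - 1)."""
    h = x / n
    smooth = sum(phi((k + 0.5) * h) for k in range(n)) * h   # midpoint rule
    return smooth + 2 * e(x - 1)

for x in (0.5, 1.5, 3.0):
    print(x, f(x), integral_of_derivative(x))
```

Dropping the term `2 * e(x - 1)` reproduces $f_1(x)$ instead of $f(x)$ for $x > 1$, which is exactly the mistake warned against above.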

Closely related to the function $e(x)$ is the signum function, $\operatorname{sgn} x$,
which can be defined thus:

$$\operatorname{sgn} x = \frac{x}{|x|}$$

It is equal to $-1$ when $x < 0$ and to $+1$ when $x > 0$, which means
it indicates the sign of the number $x$. The validity of the relation

$$\operatorname{sgn} x = 2e(x) - 1$$

is readily seen.

Integrating the function $e(x)$ we get a continuous function (the
graph is shown in Fig. 76) since

$$\int_{-\infty}^{x} e(x)\,dx = \begin{cases} \displaystyle\int_{-\infty}^{x} 0\,dx = 0 & \text{for } x < 0, \\[2mm] \displaystyle\int_{-\infty}^{0} 0\,dx + \int_0^x 1\,dx = x & \text{for } x > 0 \end{cases}$$

(check to see that this function is equal to $\frac{x + |x|}{2}$).

* The notation $f(1 - 0)$ is convenient for designating the limit of the value of
$f(1 - \varepsilon)$ as $\varepsilon \to 0$ ($\varepsilon > 0$); the notation $f(1 + 0)$ is similarly defined.
The delta function can be integrated and it can be differentiated;
its derivative $\delta'(x)$ has a still "sharper" singularity than $\delta(x)$, and it
assumes values of both signs. If we proceed from an approximate
representation of the function $\delta(x)$ in the form $\frac{m}{\sqrt{\pi}}\,e^{-(mx)^2}$ for $m$ large
(see Sec. 6.1), then we get an approximate representation of $\delta'(x)$ in
the form of the function

$$\frac{d}{dx}\left[\frac{m}{\sqrt{\pi}}\,e^{-(mx)^2}\right] = -\frac{2}{\sqrt{\pi}}\,m^3 x\,e^{-(mx)^2}$$

whose graph is shown in Fig. 77. This function assumes extremal values
for $x = \pm\frac{1}{\sqrt{2}\,m} \approx \pm\frac{0.71}{m}$, which are equal, in absolute value, to
$\sqrt{\frac{2}{\pi e}}\,m^2 \approx 0.48\,m^2$. (Check this.) These values are proportional to $m^2$
and not to $m$, as in the representation of the function $\delta(x)$.
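The "check this" can be done by brute force as well as by calculus. The sketch below assumes the Gaussian representation $\frac{m}{\sqrt{\pi}}e^{-(mx)^2}$ of the delta function, takes an arbitrary value $m = 10$, and searches a fine grid for the point where the absolute value of the derivative is largest.

```python
import math

def delta_prime_approx(x, m):
    """Derivative of (m / sqrt(pi)) * exp(-(m x)^2)."""
    return -2 * m**3 * x * math.exp(-(m * x) ** 2) / math.sqrt(math.pi)

m = 10.0
grid = [k * 1e-5 for k in range(1, 100_000)]   # 0 < x < 1
x_ext = max(grid, key=lambda x: abs(delta_prime_approx(x, m)))
peak = abs(delta_prime_approx(x_ext, m))

# Compare m * x_ext with 1/sqrt(2) and peak / m^2 with sqrt(2 / (pi * e)).
print(x_ext * m, 1 / math.sqrt(2))
print(peak / m**2, math.sqrt(2 / (math.pi * math.e)))
```

Repeating the run with another `m` leaves both printed ratios unchanged, which is the statement that the extremum sits at a distance proportional to $1/m$ and has a height proportional to $m^2$.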

We can start from an approximate representation of the function
$\delta(x)$ in the form of a triangle, as shown in Fig. 78a for $M$ large;
then the function $\delta'(x)$ will be approximately represented by the graph
shown in Fig. 78b.

If the delta function describes the density of a unit charge located
at the coordinate origin (cf. Sec. 6.1), then $\delta'(x)$ describes the density
of a dipole located at the origin. Such a dipole results if we place charges
$q$ and $-q$, respectively, at the points $x = 0$ and $x = l$, and then, while
leaving $p = ql$ (the dipole moment) unchanged, let $l$ approach zero
and $q$ infinity, so that in the limit we get two equal infinitely large
charges of opposite sign at an infinitely close distance. Prior to passing
to the limit, the charge density is of the form

$$q\,\delta(x) - q\,\delta(x - l) = p\,\frac{\delta(x) - \delta(x - l)}{l}$$

For this reason, the charge density, after passing to the limit as $l \to 0$,
is equal to $p\,\delta'(x)$.

Fig. 78
Integrals involving $\delta'(x)$ are evaluated via integration by parts:

$$\int_{-\infty}^{\infty} f(x)\,\delta'(x - a)\,dx = f(x)\,\delta(x - a)\,\Big|_{-\infty}^{\infty} - \int_{-\infty}^{\infty} f'(x)\,\delta(x - a)\,dx = -f'(a) \tag{12}$$
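Formula (12) lends itself to a direct numerical test. In the sketch below the function $\delta'$ is replaced by the derivative of the Gaussian representation of the delta function with a large but finite $m$; the choice $f(x) = \sin x$, the point $a = 0.3$ and the integration window are assumptions made here for illustration.

```python
import math

def delta_prime_approx(x, m):
    """Derivative of (m / sqrt(pi)) * exp(-(m x)^2)."""
    return -2 * m**3 * x * math.exp(-(m * x) ** 2) / math.sqrt(math.pi)

def lhs_of_12(f, a, m=50.0, half_width=0.5, n=20_000):
    """Midpoint-rule value of the integral of f(x) * delta'(x - a) with delta' smoothed."""
    h = 2 * half_width / n
    total = 0.0
    for k in range(n):
        x = a - half_width + (k + 0.5) * h
        total += f(x) * delta_prime_approx(x - a, m)
    return total * h

a = 0.3
value = lhs_of_12(math.sin, a)
print(value, -math.cos(a))   # formula (12) predicts -f'(a) = -cos(a)
```

The two printed numbers agree only approximately, since $m$ is finite; the agreement improves as $m$ is increased (with the grid refined accordingly).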

The delta function can also be considered in the plane and in
space. For example, in space we are to understand the function
$\delta(x, y, z)$ as a function equal to zero everywhere outside the coordinate
origin $(0, 0, 0)$ and equal to infinity at the origin; a function such that
the integral of that function over the entire space is equal to unity.
It is easy to verify that these conditions are fully satisfied, for example,
by the function

$$\delta(x, y, z) = \delta(x)\,\delta(y)\,\delta(z)$$

Thus, a mass $m$ localized at the point $(a, b, c)$ may be regarded as a mass
distributed in space with density

$$\rho(x, y, z) = m\,\delta(x - a)\,\delta(y - b)\,\delta(z - c)$$


Exercises

1. Find $|x|'$, $|x|''$.

3. Verify the truth of the formula (12) in straightforward fashion,
taking advantage of the series expansion of the function $f(x)$
in powers of $x - a$ and using an approximate representation of
the function $\delta'(x)$ in the form shown in Fig. 78b.


6.4 On the Stieltjes integral

The delta function is directly connected with a useful extension
of the integral concept. Let us begin with an example. Suppose
we have a mass $m$ located on a segment $(l)$ of the $x$-axis with endpoints
$a$ and $b$ and it is required to determine the force with which this mass
attracts a unit point mass $m_0$ located at the point $x = c$ to the left
of $(l)$ on the same axis. The answer is very simple. Since by Newton's
law the mass $dm$ located at point $x$ attracts $m_0$ with the force
$dF = G\,\frac{m_0\,dm}{(x - c)^2}$ (the proportionality factor $G$ here is the so-called
gravitation constant), the total force is

$$F = \int_{(l)} G\,\frac{m_0\,dm}{(x - c)^2} = G m_0 \int_{(l)} \frac{dm}{(x - c)^2} \tag{13}$$
If the mass $m$ is distributed along $(l)$ in a manner such that at
each point $x$ it has a finite density $\rho = \rho(x)$, then $dm = \rho(x)\,dx$ and
from the integral (13) we can go over to the ordinary integral

$$G m_0 \int_a^b \frac{\rho(x)}{(x - c)^2}\,dx$$

However, as was pointed out in Sec. 6.1, the mass $m$ may contain
a portion localized at separate points. Then the integral (13) can be
understood to be the integral with respect to the measure $m$. By this is
meant that to every portion $(\Delta l)$ of the line segment $(l)$ (which also
means to every point of this segment) there corresponds a "measure"
(in our case it is the mass) $m(\Delta l)$, and the law of addition holds: the
measure of the whole is equal to the sum of the measures of the parts.
The integral with respect to measure (it is also called the Stieltjes
integral) is, in the general case, of the form

$$\int_{(l)} f(x)\,d\mu \tag{14}$$

and, by definition, is equal to the limit

$$\lim \sum_{i=1}^{n} f(x_i)\,\mu(\Delta l_i)$$

which is constructed by exactly the same rule as the ordinary
integral (see, for instance, HM, Sec. 2.8), except that instead of the
length of the subintervals $(\Delta l_i)$ of the basic interval $(l)$ we take their
measure $\mu(\Delta l_i)$. If for the measure we take ordinary length, then we
return to the ordinary definition of an integral, which means the Stieltjes
integral is a generalization of this ordinary integral (which is
then often called the Riemann integral in order to distinguish it from
the Stieltjes integral).

If a given measure has a finite density $\rho = \frac{d\mu}{dx}$, then we can
pass from the Stieltjes integral to the ordinary integral

$$\int_{(l)} f(x)\,d\mu = \int_a^b f(x)\,\rho(x)\,dx \tag{15}$$

But if there are points with nonzero measure, then, as we saw in Sec. 6.1,
the density $\rho(x)$ will have delta-like terms. By admitting such terms
we can make the transition (15) in this case as well.
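The definition by a limit of sums can be tried out on a measure that has both a finite density and a point mass. The sketch below is an illustration under assumptions made here: the measure on $0 \le x \le 1$ has density 1 plus a point mass 2 at $x = 0.5$, it is described by a cumulative function `g` whose increments give the measure of each subinterval, and $f(x) = x^2$.

```python
def g(x):
    """Cumulative measure: uniform density 1 plus a point mass 2 at x = 0.5."""
    return x + (2.0 if x >= 0.5 else 0.0)

def stieltjes_sum(f, g, a, b, n=100_000):
    """Sum of f(x_i) * mu(Delta l_i), with mu of a subinterval taken as the increment of g."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        left, right = a + i * h, a + (i + 1) * h
        total += f(0.5 * (left + right)) * (g(right) - g(left))
    return total

value = stieltjes_sum(lambda x: x * x, g, 0.0, 1.0)
# Transition (15) with density 1 + 2*delta(x - 0.5): ordinary integral plus delta-term contribution.
exact = 1.0 / 3.0 + 2.0 * 0.5**2
print(value, exact)
```

The point mass enters the sum through the single subinterval that contains $x = 0.5$, whose measure does not shrink as the partition is refined; this is the discrete counterpart of the delta-like term in the density.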
Exercise

Frequently, the measure on the $x$-axis is given with the aid of an
auxiliary function $g(x)$ by the rule: the measure of the interval
$\alpha \le x \le \beta$ is equal to $g(\beta + 0) - g(\alpha - 0)$ (or simply $g(\beta) - g(\alpha)$
if the function $g(x)$ is continuous). Then instead of the integral
(14) we write $\int_{(l)} f(x)\,dg(x)$, and formula (15) assumes the form

$$\int_{(l)} f(x)\,dg(x) = \int_a^b f(x)\,g'(x)\,dx$$

Find $\int_0^1 x^3\,d(x^2)$, $\int_{-1}^{1} \sin x\,de(x)$, $\int_{-1}^{1} \cos x\,de(x)$.


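ANSWERS AND SOLUTIONS

Sec. 6.1

1. $3^2 = 9$.

2. (a) $[(-5)^2 + 3]\,\delta(x + 5) = 28\,\delta(x + 5)$,

(b) $\delta(2x - 8) = \delta(2(x - 4)) = \frac{1}{2}\,\delta(x - 4)$,

(c) the polynomial $P(x) = x^2 + x - 2$ has the zeros
$x_1 = 1$, $x_2 = -2$ and $P'(x_1) = 3$, $P'(x_2) = -3$, whence
$\delta(x^2 + x - 2) = \frac{1}{3}\,\delta(x - 1) + \frac{1}{3}\,\delta(x + 2)$.

Sec. 6.2

(a) $2\delta(x - \xi)$, (b) $\sin x \cdot \delta(x - \xi) = \sin \xi \cdot \delta(x - \xi)$,
(c) $\delta(x - \xi + 1)$,

(d) $\delta(x^2 - \xi) = \begin{cases} 0 & (\xi < 0), \\ \dfrac{1}{2\sqrt{\xi}}\,\big[\delta(x - \sqrt{\xi}) + \delta(x + \sqrt{\xi})\big] & (\xi > 0) \end{cases}$

Sec. 6.3

1. $\operatorname{sgn} x$, $2\delta(x)$.

2.

3.
$$\int f(x)\,\delta'(x - a)\,dx \approx \int_{a - \frac{1}{2M}}^{a} f(x)\,4M^2\,dx - \int_{a}^{a + \frac{1}{2M}} f(x)\,4M^2\,dx$$
$$\approx 4M^2 \left\{ \int_{a - \frac{1}{2M}}^{a} [f(a) + f'(a)(x - a)]\,dx - \int_{a}^{a + \frac{1}{2M}} [f(a) + f'(a)(x - a)]\,dx \right\}$$
$$= 4M^2 \left\{ \left[ f(a)\,\frac{1}{2M} - \frac{f'(a)}{2}\,\frac{1}{4M^2} \right] - \left[ f(a)\,\frac{1}{2M} + \frac{f'(a)}{2}\,\frac{1}{4M^2} \right] \right\} = -f'(a)$$

Passing to the limit as $M \to \infty$, we get the exact equation (12).

Sec. 6.4

$\frac{2}{5}$, 0, 1.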

Chapter 7

DIFFERENTIAL EQUATIONS

The most elementary types of differential equations that arise when
considering various physical processes, like water flowing out of a
vessel, radioactive decay, the motion of a material particle, are handled
by the fundamentals of integral calculus (see HM, Chs. 5, 6, 7). Here we
introduce some general concepts referring to differential equations and
will consider certain classes of equations that are not hard to solve.
The subject will be discussed further in Ch. 8.

Differential equations involving a single independent variable are
called ordinary differential equations. If there are two or more
independent variables, then the partial derivatives with respect to these
variables enter into the differential equation. Such equations are termed
partial differential equations.

In this and the next chapters we will consider only ordinary differential
equations. We will need the results of Secs. 6.1 and 6.2 and
Euler's formula given in Sec. 5.3.

7.1 Geometric meaning of a first-order differential equation

A differential equation of the first order is a relation of the form

$$F\left(x, y, \frac{dy}{dx}\right) = 0$$

where $y$ is an unknown function of $x$. From now on we will consider
this equation to be solved for the derivative:

$$\frac{dy}{dx} = f(x, y) \tag{1}$$

It turns out that even without searching for an analytic solution
$y(x)$ in the form of a formula, it is possible to get an idea of the general
pattern of these solutions on the basis of the geometric meaning of
equation (1). That is precisely what we shall do in this section.

Let us recall the geometric meaning of the derivative $\frac{dy}{dx}$. In
the $xy$-plane, the quantity $\frac{dy}{dx}$, for the curve $y = y(x)$, is equal to
the slope of the tangent line to the curve. Hence, knowing the dependence
of $\frac{dy}{dx}$ on the variables $x$ and $y$ expressed by (1), we can find
the direction of the tangent line to the curve that serves as the graph
of the solution of (1), and this direction can be determined for any
point of the plane. The graph of a solution of a differential equation
is called an integral curve of the equation.

Fig. 79

The direction of the tangent line can be shown by drawing through
any point $(x, y)$ a short straight line at the angle $\theta$ that satisfies the
condition $\tan \theta = f(x, y)$.*

* There is no need to find the angle $\theta$ and construct it. The required direction can
be found much faster by laying off on the $x$-axis a segment of unit length, and
on the $y$-axis a segment of length $\frac{dy}{dx} = \tan \theta$ (Fig. 79).

For example, let $\frac{dy}{dx} = x^2 + y^2$; then $f(x, y) = x^2 + y^2$.
Set up the table

     x      y     dy/dx = tan θ
    -1     -1          2
    -1      0          1
    -1      1          2
     0     -1          1
     0      0          0
     0      1          1
     1     -1          2
     1      0          1
     1      1          2
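A table of slopes of this kind is easily generated by machine, which becomes useful once the number of points is increased. The sketch below evaluates $f(x, y) = x^2 + y^2$ at the nine points with coordinates $-1$, $0$, $1$; the variable names are chosen here for illustration.

```python
def f(x, y):
    """Right-hand side of dy/dx = x^2 + y^2."""
    return x**2 + y**2

points = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
slopes = {(x, y): f(x, y) for (x, y) in points}

for (x, y), s in slopes.items():
    # A short segment through (x, y) with direction (1, s) has slope tan(theta) = s.
    print(f"x={x:2d}  y={y:2d}  dy/dx={s}")
```

Replacing the three coordinate values by a denser set gives as many direction segments as are needed for a drawing of the direction field.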
Fig. 80 shows the directions of the tangent lines at each of the
nine points given in the table. If the number of points thus indicated
is increased, one gets a picture of the collection of curves that satisfy
the differential equation. (See Fig. 81, which corresponds to the equation
$\frac{dy}{dx} = x^2 + y^2$.) It is clear that the equation has an infinity of
integral curves and that one such curve passes through every point
$(x_0, y_0)$ of the plane. And so in order to isolate from all the solutions
of equation (1) some one definite particular solution, we have to specify
a supplementary condition:

for some value $x = x_0$ the value $y = y_0$ is given    (2)

This condition is termed the initial condition because if time is the
independent variable, then the condition (2) signifies specifying the
desired function at the initial instant of time.

Although two parameters, $x_0$ and $y_0$, are given in the initial
condition (2), actually there is only one degree of freedom in choosing a
particular solution of equation (1), for the point $(x_0, y_0)$ can move along
the integral curve determined by it, and the curve of course does not
change because of this. In that motion there is one degree of freedom,
which is thus extra in the choice of the integral curve; that is, in such
a choice there is actually $2 - 1 = 1$ degree of freedom (see a similar
discussion in Sec. 4.8). In order to indicate an essential parameter in
this choice we can fix $x_0$ and draw the vertical line $x = x_0$; then
various integral curves will intersect it at different heights. This means
that the different curves correspond to distinct values of $y(x_0) = y_0$.

In order to draw a large number of line segments that yield the
direction of a tangent line, it is convenient to take advantage of the
following device. In the drawing, construct the lines $f(x, y) = C$ for
several values of the constant $C$. By (1), the value of $\tan \theta$ at
each point of such a curve is constant and equal to $C$, which means
that all the segments indicating the direction of the tangent line* at
any point of the curve $f(x, y) = C$ are parallel.

* Bear in mind that we are dealing with tangents to the integral curves of the
differential equation $\frac{dy}{dx} = f(x, y)$ and not with the tangents to the curve
$f(x, y) = C$ itself.

The curves $f(x, y) = C$ are called isoclines. In particular, the curve
$f(x, y) = 0$ is called the isocline of zeros. At each point of this curve
the tangent to the integral curves of equation (1) is horizontal. The
line at whose points the tangents are vertical is called the isocline
of infinities. For example, the isocline of infinities for the equation
$\frac{dy}{dx} = \frac{2x + y}{x - y - 1}$ is the straight line $x - y = 1$.

In Fig. 81 it is clearly seen that the integral curves do not intersect:
at any rate, not at a nonzero angle. Indeed, (1) shows that for given
$x$ and $y$ there is only one definite value of $\frac{dy}{dx}$; that is, a curve can
pass through this point only with one specific slope. A more detailed
investigation shows that distinct integral curves cannot be in contact
at any point if at that point the right member of (1) and its partial
derivative with respect to $y$ assume finite values. Thus, the condition
(2) does indeed define a unique solution to equation (1).

Exercises

1. Find the isoclines of the equation $\frac{dy}{dx} = x^2 + y^2$.

2. Find the equation of the locus of points of inflection of the
integral curves of the general equation (1); of the equation
$\frac{dy}{dx} = x^2 + y^2$.

7.2 Integrable types of first-order equations

We now consider several types of differential equations that can
be solved without difficulty.

I. Equations with variables separable. These are equations of
the form

$$\frac{dy}{dx} = \varphi(y)\,\psi(x)\,{}^{*} \tag{3}$$

Here the right-hand member $\varphi(y)\,\psi(x)$ is a product of two functions,
one of which depends on $y$ alone and the other on $x$ alone.

* Equations of this type are found, for example, in the problem of radioactive
decay and in the problem of water discharge from a vessel (see HM, Ch. 5).

We rewrite the equation thus:

$$\frac{dy}{\varphi(y)} = \psi(x)\,dx$$

Integrating the right and left members, we get

$$\int \frac{dy}{\varphi(y)} = \int \psi(x)\,dx + C \tag{4}$$

(we write only one arbitrary constant because both constants that
appear when evaluating the integrals can be combined into one). From
this general solution of equation (3) we get the particular solutions by
assigning all possible values to $C$. We see that in the general solution
of (1) there is one arbitrary constant, which is in accord with the
existence of one degree of freedom in the choice of the particular solution
(Sec. 7.1).

It is easy to find $C$ if we have the extra initial condition (2). For
this purpose we write (4), for brevity, thus:

$$\Phi(y) = \Psi(x) + C$$
Setting $x = x_0$, $y = y_0$, we get

$$\Phi(y_0) = \Psi(x_0) + C$$

whence

$$C = \Phi(y_0) - \Psi(x_0)$$

and, finally,

$$\Phi(y) = \Psi(x) + \Phi(y_0) - \Psi(x_0)$$

that is,

$$\Phi(y) - \Phi(y_0) = \Psi(x) - \Psi(x_0)$$

The particular solution thus found can also be written as

$$\int_{y_0}^{y} \frac{dy}{\varphi(y)} = \int_{x_0}^{x} \psi(x)\,dx$$

It is immediately apparent that this solution satisfies the condition
(2). We obtain the desired solution by carrying out the integration.
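The last relation can be checked on a concrete case. The example below is an assumption made here for illustration: $\varphi(y) = y$, $\psi(x) = 2x$ with the initial condition $y(0) = 1$, for which carrying out the integration gives $\ln y = x^2$, i.e. $y = e^{x^2}$. The sketch evaluates both integrals numerically along this solution.

```python
import math

def integrate(func, lo, hi, n=10_000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return sum(func(lo + (k + 0.5) * h) for k in range(n)) * h

phi = lambda y: y        # assumed factor depending on y alone
psi = lambda x: 2 * x    # assumed factor depending on x alone
x0, y0 = 0.0, 1.0        # assumed initial condition (2)

def candidate(x):
    """Closed-form solution of dy/dx = 2 x y with y(0) = 1."""
    return math.exp(x * x)

x = 1.2
left = integrate(lambda y: 1 / phi(y), y0, candidate(x))   # integral of dy / phi(y) from y0 to y
right = integrate(psi, x0, x)                              # integral of psi(x) dx from x0 to x
print(left, right)
```

The two integrals coincide along the solution, as the formula requires; for $x = x_0$ both vanish, which reflects the fact that condition (2) is satisfied.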

II. Homogeneous linear equations. Suppose we have a homogeneous
linear equation of the first order:

$$\frac{dy}{dx} = f(x)\,y \tag{5}$$

This equation is a particular case of equation (3), but we will examine
it in more detail because of its great importance. Separating variables
in (5) and integrating, we get

$$\frac{dy}{y} = f(x)\,dx, \qquad \ln y = \int_a^x f(x)\,dx + \ln C$$

On the right we write the arbitrary constant in the form $\ln C$ for
convenience of manipulation. From this we find $y$:

$$y = C\,e^{\int_a^x f(x)\,dx} \tag{6}$$

where $a$ is a fixed value of $x$. Formula (6) provides the general solution
of (5).

For brevity put

$$y_1(x) = e^{\int_a^x f(x)\,dx} \tag{7}$$

Since this function is obtained from (6) for $C = 1$, it is a particular
solution of (5). Formula (6) may be written as

$$y = C\,y_1(x) \tag{8}$$

It is also easy to verify directly that if $y_1(x)$ is some particular
solution of equation (5), then the function (8) for any constant $C$ also
satisfies (5):

$$\frac{dy}{dx} = \frac{d(Cy_1)}{dx} = C\,\frac{dy_1}{dx} = C f(x)\,y_1 = f(x)\,y$$

Thus, in order to obtain the general solution of equation (5),
take any particular solution and multiply it by an arbitrary constant.
Setting $C = 0$, for instance, we see that one of the particular solutions
of (5) is identically zero; this zero solution is of course not suitable
for constructing the general solution.

III. Nonhomogeneous linear equations. Suppose we have the
first-order nonhomogeneous linear equation

$$\frac{dy}{dx} = f(x)\,y + g(x)\,{}^{*} \tag{9}$$

* This kind of equation occurs for example in the problem of radioactive decay
(see HM, Ch. 5). Here, the independent variable $x$ is the time, and the function
$y$ is the amount of radioactive material in the system, so that the desired
solution $y(x)$ describes the law of variation of quantity in time. The coefficient $f(x)$
is equal to a negative constant whose absolute value is the probability of decay
of the given substance in unit time, and the free term $g(x)$ is equal to the rate
at which this substance is introduced into the system at hand.

We seek the solution of (9) that vanishes for some value $x = x_0$.
For a fixed function $f(x)$, this solution $y(x)$ is determined by the choice
of the function $g(x)$, that is, $g(x)$ may be interpreted as a kind of external
action and $y(x)$ as its result (in other words, the law by which the
function $g(x)$ is associated with the solution $y(x)$ is an operator, see
Sec. 6.2). It is easy to verify that the principle of superposition
holds true here; which means that if the functions $g(x)$ are added,
so also are the results. Indeed, if

$$\frac{dy_1}{dx} = f(x)\,y_1 + g_1(x), \qquad \frac{dy_2}{dx} = f(x)\,y_2 + g_2(x)$$

and $y_1(x_0) = 0$, $y_2(x_0) = 0$, then the function $y = y_1(x) + y_2(x)$
satisfies the equation

$$\frac{dy}{dx} = f(x)\,y + [g_1(x) + g_2(x)]$$

and the condition $y(x_0) = 0$ (why?).

On the basis of Sec. 6.2, the solution of equation (9) may be
obtained by constructing the appropriate influence function $G(x, \xi)$,
which serves as a solution of the equation

$$\frac{dy}{dx} = f(x)\,y + \delta(x - \xi) \tag{10}$$

for any fixed $\xi$. Let us assume that $\xi > x_0$; then for $x_0 < x < \xi$
the equation (10) becomes (5), which means the solution is of the
form (8), but since a solution is being sought for which $y(x_0) = 0$,
it follows that $C = 0$, or $y(x) = 0$. If we also integrate (10) from
$x = \xi - 0$ to $x = \xi + 0$, then we get

$$y(\xi + 0) - y(\xi - 0) = \int_{\xi - 0}^{\xi + 0} f(x)\,y\,dx + \int_{\xi - 0}^{\xi + 0} \delta(x - \xi)\,dx = 0 + 1 = 1$$

(since the solution $y$ is a finite solution, the integral of the first term
on the right of (10) over an infinitesimal interval is infinitely small
and can therefore be disregarded). But according to what has just
been proved, $y(\xi - 0) = 0$, whence

$$y(\xi + 0) = 1 \tag{11}$$

However, for $x > \xi$ equation (10) also becomes (5) and therefore has
the solution (8); the condition (11) yields

$$1 = C\,y_1(\xi), \quad \text{or} \quad C = \frac{1}{y_1(\xi)} \quad \text{and} \quad y = \frac{y_1(x)}{y_1(\xi)}$$

Thus, in the given problem Green's function is of the form

$$G(x, \xi) = \begin{cases} 0 & (x_0 < x < \xi), \\[1mm] \dfrac{y_1(x)}{y_1(\xi)} & (x > \xi) \end{cases}$$

The graph of the function $G(x, \xi)$ for fixed $\xi$ is shown in Fig. 82 by
the heavy line, which has a discontinuity at $x = \xi$.

Fig. 82

We can now write the desired solution of (9) for any function
$g(x)$ on the basis of the appropriately transformed general formula
(8) of Ch. 6:

$$y(x) = \int_{x_0}^{\infty} G(x, \xi)\,g(\xi)\,d\xi = \int_{x_0}^{x} G(x, \xi)\,g(\xi)\,d\xi + \int_{x}^{\infty} G(x, \xi)\,g(\xi)\,d\xi = \int_{x_0}^{x} \frac{y_1(x)}{y_1(\xi)}\,g(\xi)\,d\xi \tag{12}$$

where the function $y_1(x)$ is given by formula (7). In a similar manner
it can be verified that the same final formula (12) holds true also for
$x < x_0$. Note that in the right-hand member we can take $y_1(x)$ (but
not $y_1(\xi)$!) out from under the integral sign.

Particularly frequent use is made of the last formula for $x_0 = -\infty$.
Physically speaking, this is the most natural choice, for the
solution (12) then has the property that if the function $g(x)$ is identically
equal to zero up to some $x$, then the solution too is identically zero
up to this $x$. Thus, in this case the formula (12) yields a sort of "pure"
result of the action of the function $g(x)$.

Formula  (12)  gives  a particular  solution  of  equation  (9):  that 
solution  which  vanishes  for  x = x0.  To  obtain  the  general  solution 
of  (9),  note  that  the  difference  of  any  two  solutions  of  this  equation 
satisfies the appropriate homogeneous equation (5): if

$$ \frac{dy_1}{dx} = f(x)\,y_1 + g(x), \qquad \frac{dy_2}{dx} = f(x)\,y_2 + g(x) $$

then

$$ \frac{d(y_1 - y_2)}{dx} = f(x)\,(y_1 - y_2) $$

(why?).  Hence  this  difference  must  be  of  the  form  (8),  where  C is 
arbitrary.  To  summarize,  the  general  solution  of  a nonhomogeneous 
linear  equation  is  the  sum  of  any  particular  solution  and  the  general 
solution  of  the  corresponding  homogeneous  equation.  Choosing  for 
this  particular  solution  the  solution  (12)  that  we  found,  we  obtain 
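the general solution of (9) in the form

$$ y = \int_{x_0}^{x} \frac{y_1(x)}{y_1(\xi)}\,g(\xi)\,d\xi + C y_1(x) \tag{13} $$

If we are interested in the particular solution that satisfies condition (2), then, substituting $x = x_0$ into (13), we get

$$ y_0 = 0 + C y_1(x_0), \quad \text{or} \quad C = \frac{y_0}{y_1(x_0)} $$

and finally we get the desired solution

$$ y = \int_{x_0}^{x} \frac{y_1(x)}{y_1(\xi)}\,g(\xi)\,d\xi + \frac{y_0}{y_1(x_0)}\,y_1(x) $$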


The  same  results  can  be  attained  in  a different  way,  with  the 
aid  of  a faster  but  artificial  device  called  the  variation  of  constants 
(parameters).  The  celebrated  French  mathematician  and  mechanician 
Lagrange  proposed  seeking  the  solution  of  equation  (9)  in  the  form 
(by proceeding from (8)) of

$$ y = u(x)\,y_1(x) \tag{14} $$

where u(x) is some unknown function and $y_1(x)$ has the earlier sense of (7). Substituting (14) into (9), we get

$$ u y_1' + u' y_1 = f(x)\,u y_1 + g(x) $$

But since $y_1$ is a solution of the homogeneous equation, the first terms on the left and right cancel out, and we get

$$ u' y_1(x) = g(x), \quad \text{whence} \quad u' = \frac{g(x)}{y_1(x)}, \qquad u(x) = \int_{x_0}^{x} \frac{g(\xi)}{y_1(\xi)}\,d\xi + C $$

(in the last integral we changed the variable of integration to $\xi$ in order to distinguish it from the upper limit). Substituting the result into (14), we arrive at formula (13).

IV. Simple cases of linear equations. There are cases where the solution of a linear equation is particularly simple, as, for instance, if the coefficient f(x) is constant:

$$ f(x) = p = \text{constant} $$

Then the homogeneous equation (5) is of the form

$$ \frac{dy}{dx} = p y $$

and the general solution is

$$ y = C e^{px} \tag{15} $$

This can be obtained from (7) and (8), assuming for the sake of simplicity a = 0, but it can also be obtained easily and directly by means of separation of variables or simple substitution.

The  corresponding  nonhomogeneous  equation  is  particularly 
simple  to  solve  if  the  free  term,  g(x),  is  a constant  or  an  exponential 
function. First consider the equation

$$ \frac{dy}{dx} = p y + A \quad (A = \text{constant}) \tag{16} $$

As we know, to find the general solution it is necessary to find some particular solution and then add to it the general solution (15) of the corresponding homogeneous equation. However, a particular solution of (16) can easily be found in the form y = B = constant. Substituting into (16), we get

$$ 0 = pB + A, \quad \text{or} \quad B = -\frac{A}{p} $$

Thus, the general solution of equation (16) is of the form

$$ y = -\frac{A}{p} + C e^{px} $$

Now consider the equation

$$ \frac{dy}{dx} = p y + A e^{kx} \tag{17} $$

The derivative of an exponential function is proportional to the function itself, and so it is natural to seek the particular solution of (17) in the form

$$ y = B e^{kx} \tag{18} $$

because then, after substitution into (17), all terms will be similar and we can hope to attain equality by a proper choice of B. We then have

$$ B k e^{kx} = p B e^{kx} + A e^{kx} $$

whence it is easy to find $B = \frac{A}{k - p}$. Substituting into (18), we get a particular solution of equation (17) and, after adding the general solution of the corresponding homogeneous equation, we get the general solution of (17):

$$ y = \frac{A}{k - p}\,e^{kx} + C e^{px} $$

where C is an arbitrary constant.

This solution is clearly not suitable if k = p. In this special case we find the solution of (17) with the aid of the general formula (13), noting that in the case at hand the particular solution $y_1(x)$ of the homogeneous equation is equal to $e^{px}$. Taking $x_0 = 0$ for the sake of simplicity, we get

$$ y = \int_{0}^{x} \frac{e^{px}}{e^{p\xi}}\,A e^{p\xi}\,d\xi + C e^{px} = A e^{px} \int_{0}^{x} d\xi + C e^{px} = A x e^{px} + C e^{px} \tag{19} $$

Thus, for k = p, the particular solution acquires the supplementary factor x multiplying the exponential.
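
The two forms of the particular solution, (18) with B = A/(k - p) and the term A x e^{px} of (19), can be tested by substituting them into equation (17). The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the derivative is approximated by a central difference rather than computed exactly.

```python
import math

def particular_solution(A, k, p, x):
    """Particular solution of y' = p*y + A*exp(k*x).

    For k != p this is (18) with B = A/(k - p); for k == p it is the
    term A*x*exp(p*x) of formula (19).
    """
    if k == p:
        return A * x * math.exp(p * x)
    return A / (k - p) * math.exp(k * x)

def residual(A, k, p, x, h=1e-5):
    """y' - p*y - A*exp(k*x), with y' taken as a central difference."""
    dy = (particular_solution(A, k, p, x + h)
          - particular_solution(A, k, p, x - h)) / (2 * h)
    return dy - p * particular_solution(A, k, p, x) - A * math.exp(k * x)

print(residual(2.0, 3.0, 1.0, 0.5))   # nonresonant case, k != p
print(residual(2.0, 1.0, 1.0, 0.5))   # special case k == p
```

Both residuals should be close to zero, limited only by the finite-difference step.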

The general nonhomogeneous equation

$$ \frac{dy}{dx} = p y + g(x) $$

is solved with the aid of Green's function, which in this case is particularly simple:

$$ G(x, \xi) = \begin{cases} 0 & (x_0 < x < \xi) \\ e^{p(x - \xi)} & (x > \xi) \end{cases} $$

The general solution, because of formula (13), is of the form

$$ y = \int_{x_0}^{x} e^{p(x - \xi)}\,g(\xi)\,d\xi + C e^{px} $$

The equations considered here do not exhaust the types of equations whose solutions can be written down in the form of exact formulas involving elementary functions and integrals of them. Several other such types can be found in texts on differential equations (they are fully covered in Kamke [10]).

Exercises 

Find  the  solutions  to  the  following  differential  equations  that 
satisfy  the  indicated  initial  data. 

1. $\frac{dy}{dx} = 2xy$, $y = 1$ for $x = 0$.

2.  *?.  = *,  y=  i for  *=  i. 

dx  x 

3. $\frac{dy}{dx} = e^{-y}$, $y = 1$ for $x = 0$.

4.  — = — , y = 1 for  x — 0. 

dx  y 

5.  = — y + ex,  y = — for  a;  = 1. 

dx  e 

6. $\frac{dy}{dx} = -2y + 4x$, $y = -2$ for $x = 0$.

7.  — + y = cos  x,  y — ~ for  x = 0. 

dx  J J 2 

7.3  Second-order  homogeneous  linear  equations  with  constant 
coefficients 

A differential equation of the second order contains the second derivative of the desired function and has the general form

$$ F\left(x, y, \frac{dy}{dx}, \frac{d^2y}{dx^2}\right) = 0 $$

If this equation depends linearly on the desired function and its derivatives (the dependence on x may be of any kind), it is called a linear equation. Thus, a second-order linear equation has the general form

$$ a(x)\,\frac{d^2y}{dx^2} + b(x)\,\frac{dy}{dx} + c(x)\,y = f(x) $$

We will consider the most important special case where the coefficients a, b, c of the desired function and its derivatives are constant. This equation is encountered in problems involving mechanical vibrations and electric oscillations (see HM, Chs. 6 and 8); the independent variable in these problems is the time t. Let us examine the mechanical vibrations of the most elementary type of oscillator. The equation then has the form

$$ m\,\frac{d^2y}{dt^2} + h\,\frac{dy}{dt} + ky = f(t) \tag{20} $$

where y is the deviation of an oscillating point from the position of equilibrium, m is its mass, h is the coefficient of friction (the friction is assumed to be proportional to the velocity), k is the coefficient of elasticity of the restoring force (which force is assumed to be proportional to the deviation), and f(t) is an external force.*

We first consider the equation of free oscillations, which has the form

$$ m\,\frac{d^2y}{dt^2} + h\,\frac{dy}{dt} + ky = 0 \tag{21} $$

and is called a homogeneous linear equation. In many respects, its properties are similar to those of the homogeneous linear equation of the first order that was discussed in Sec. 7.2. For instance, it is easy to verify that if $y_1(t)$ is a particular solution of (21), then so also is $C y_1(t)$, where C is any constant. In particular, for C = 0 we get the identically zero solution of equation (21). Besides, it is easy to verify that if $y_1(t)$ and $y_2(t)$ are two particular solutions of (21), then their sum, $y = y_1(t) + y_2(t)$, is also a solution of that equation (check this by substituting this sum into (21)).

From the foregoing it follows that if we have found two particular solutions $y_1(t)$ and $y_2(t)$ of equation (21), then their linear combination,

$$ y = C_1 y_1(t) + C_2 y_2(t) \tag{22} $$

where $C_1$ and $C_2$ are arbitrary constants, is also a solution of that equation. But the general solution of a second-order differential equation is obtained via two integrations and for that reason contains two arbitrary constants. This means that the expression (22) serves as the general solution of equation (21). Of course, $y_1(t)$ and $y_2(t)$ should not be proportional here, since if $y_2(t) = k y_1(t)$, then

$$ C_1 y_1(t) + C_2 y_2(t) = (C_1 + C_2 k)\,y_1(t) = C y_1(t) $$

which is to say that actually there is only one arbitrary constant here (the condition that the parameters be essential is not fulfilled, cf. Sec. 4.8).

How do we find two "independent" solutions of equation (21)? Euler, by proceeding from the property of an exponential being proportional to its derivatives, proposed seeking the particular solutions in the form

$$ y = e^{pt} \tag{23} $$

where p is a constant that has to be found. Since with such a choice $\frac{dy}{dt} = p e^{pt}$ and $\frac{d^2y}{dt^2} = p^2 e^{pt}$, after substituting into (21) and cancelling out $e^{pt}$, we get, for the determination of p, the quadratic equation (the so-called characteristic equation)

$$ m p^2 + h p + k = 0 \tag{24} $$

* The reader will do well to review the results of HM, Ch. 6. Here the exposition is more complete and will be conducted from a broader base.

As we know from algebra, in solving a quadratic equation there may be different cases, depending on the sign of the discriminant:

$$ D = h^2 - 4mk $$

If the friction is great, to be more exact, if $h^2 > 4mk$, then (24) has real roots:

$$ p_{1,2} = \frac{-h \pm \sqrt{h^2 - 4mk}}{2m} $$

Denote them by $p_1 = -a$ and $p_2 = -b$, since they are both negative. Then, on the basis of (22) and (23), we get the general solution of (21) in the form

$$ y = C_1 e^{-at} + C_2 e^{-bt} \tag{25} $$

where $C_1$ and $C_2$ are constants. Thus, in the case of considerable friction, the deviation of a point from the equilibrium position tends to zero exponentially with t without oscillations.

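If the friction is small, to be more exact, if $h^2 < 4mk$, then (24) has imaginary conjugate roots:

$$ p_{1,2} = -\frac{h}{2m} \pm i\sqrt{\frac{k}{m} - \frac{h^2}{4m^2}} = -\gamma \pm i\omega $$

where $\gamma = \frac{h}{2m}$, $\omega = \sqrt{\frac{k}{m} - \frac{h^2}{4m^2}}$. Having in view (23), we get the general solution (22):

$$ y = C_1 e^{-\gamma t + i\omega t} + C_2 e^{-\gamma t - i\omega t} = e^{-\gamma t}\left(C_1 e^{i\omega t} + C_2 e^{-i\omega t}\right) \tag{26} $$

As was determined in Sec. 5.4, the multipliers $e^{i\omega t}$ and $e^{-i\omega t}$ are periodic functions with period $T = \frac{2\pi}{\omega}$. For this reason, $\omega$ is the circular frequency. The multiplier $e^{-\gamma t}$ characterizes the rate of decay of oscillations. The expression (26) will be a solution of equation (21) for any constants $C_1$ and $C_2$. To obtain a real solution, take any $C_1 = \frac{1}{2} r e^{i\varphi}$ and put $C_2 = \frac{1}{2} r e^{-i\varphi} = C_1^*$. Then

$$ y = \frac{1}{2}\,r e^{-\gamma t}\left[e^{i(\omega t + \varphi)} + e^{-i(\omega t + \varphi)}\right] $$

Taking advantage of Euler's formula, we get

$$ y = \frac{1}{2}\,r e^{-\gamma t}\left[\cos(\omega t + \varphi) + i\sin(\omega t + \varphi) + \cos(\omega t + \varphi) - i\sin(\omega t + \varphi)\right] = r e^{-\gamma t}\cos(\omega t + \varphi) $$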

The solution $y = r e^{-\gamma t}\cos(\omega t + \varphi)$ is real and contains two arbitrary constants, r and $\varphi$. This is sometimes written differently, the cosine of the sum being expanded:

$$ y = r e^{-\gamma t}(\cos\omega t\,\cos\varphi - \sin\omega t\,\sin\varphi) = (r\cos\varphi)\,e^{-\gamma t}\cos\omega t + (-r\sin\varphi)\,e^{-\gamma t}\sin\omega t = C_1 e^{-\gamma t}\cos\omega t + C_2 e^{-\gamma t}\sin\omega t $$

where $C_1$ and $C_2$ denote the appropriate multipliers in brackets (they are not equal to $C_1$ and $C_2$ in formula (26)!). Here the real independent solutions of equation (21) are quite apparent (see formula (22)).
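
The case analysis by the sign of the discriminant can be summarized in a few lines of code. The Python sketch below is an illustration only; the function name `characteristic_roots` and the regime labels are assumptions, not terminology from the text. It solves the characteristic equation (24) with complex arithmetic, so that for small friction the real part of a root gives -gamma and the imaginary part gives omega.

```python
import cmath

def characteristic_roots(m, h, k):
    """Roots of m*p**2 + h*p + k = 0 (equation (24)) and the friction regime."""
    disc = h * h - 4 * m * k
    root = cmath.sqrt(disc)            # purely imaginary when disc < 0
    p1 = (-h + root) / (2 * m)
    p2 = (-h - root) / (2 * m)
    if disc > 0:
        regime = "overdamped"          # real roots -a, -b: solution (25)
    elif disc < 0:
        regime = "underdamped"         # roots -gamma +/- i*omega: solution (26)
    else:
        regime = "critical"            # coincident roots
    return p1, p2, regime

print(characteristic_roots(1.0, 3.0, 2.0))   # great friction
print(characteristic_roots(1.0, 2.0, 5.0))   # small friction
```

For m = 1, h = 2, k = 5 the formulas of the text give gamma = h/(2m) = 1 and omega = sqrt(k/m - h^2/(4m^2)) = 2, which is what the complex root returned by the sketch encodes.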

Ordinarily, this transition to the real solution is carried out faster via the following reasoning. If into the left member of (21) we substitute a complex function of the real argument t and carry out all operations, then, by virtue of the properties of these functions given in Sec. 5.5, the same operations will be carried out on the real and the imaginary parts of this function. Therefore, if zero results from performing these operations on a complex function, then we get zero after performing these operations on the real part of the function (and on the imaginary part as well). This means that if we have a complex solution (23) of equation (21), then the real and imaginary parts of the solution are also solutions of (21).

Note that the last assertion holds true for any homogeneous linear equation (i.e. an equation involving y and its derivatives linearly and such that it has no term not involving y or a derivative of y) with real coefficients. Now if y and its derivatives occur nonlinearly in the equation, this assertion is no longer valid. For this reason, to take an example, if a quadratic equation has two imaginary roots, then the real and imaginary parts of the roots are not, separately, roots of this equation.

Thus, when the friction is slight, the oscillations decay exponentially. Absence of friction offers a special case: h = 0. In this case, the characteristic equation (24) is of the form

$$ m p^2 + k = 0 $$

whence

$$ p_{1,2} = \pm i\sqrt{\frac{k}{m}} $$

The solution of the differential equation has the form

$$ y = C_1 e^{i\omega t} + C_2 e^{-i\omega t} = r\cos(\omega t + \varphi), \quad \text{where} \quad \omega = \sqrt{\frac{k}{m}} \tag{27} $$

What this means is that in the system at hand we have nondecaying harmonic oscillations with arbitrary amplitude and arbitrary initial phase and a quite definite frequency $\omega = \sqrt{k/m}$.

It is interesting to follow the behaviour of the total energy of the system. It is easy to show (see HM, Sec. 6.10) that this energy has the expression

$$ E = \frac{mv^2}{2} + \frac{ky^2}{2} = \frac{1}{2}\left[m\left(\frac{dy}{dt}\right)^2 + ky^2\right] \tag{28} $$

The first term on the right is equal to the kinetic energy, the second term, to the potential energy of the oscillator. Substituting (27) into (28) for a solution, we get

$$ E = \frac{1}{2}\,m\omega^2 r^2 \sin^2(\omega t + \varphi) + \frac{1}{2}\,k r^2 \cos^2(\omega t + \varphi) = \frac{1}{2}\,k r^2 $$

(here we have used $m\omega^2 = k$). Thus, in the case of h = 0 the total energy remains constant and all we have is a "pumping" of kinetic energy into potential energy and back again.

If there is friction, the total energy of the system diminishes, being dissipated (it passes into heat, which is not taken into consideration in the differential equation). Differentiating (28) and using the differential equation (21), we get

$$ \frac{dE}{dt} = \frac{1}{2}\left[2m\,\frac{dy}{dt}\,\frac{d^2y}{dt^2} + 2ky\,\frac{dy}{dt}\right] = \frac{dy}{dt}\left(-h\,\frac{dy}{dt} - ky\right) + ky\,\frac{dy}{dt} = -h\left(\frac{dy}{dt}\right)^2 $$


The  derivative  is  negative  and  hence  E decreases. 
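
The relation dE/dt = -h (dy/dt)^2 can be verified numerically on the small-friction solution y = r exp(-gamma t) cos(omega t + phi). The following Python sketch is an illustration under assumed parameter values (m = 1, h = 0.4, k = 4); the function names are hypothetical and the time derivative of the energy is approximated by a central difference.

```python
import math

def damped_state(t, m, h, k, r=1.0, phi=0.0):
    """y and dy/dt for y = r*exp(-gamma*t)*cos(omega*t + phi), valid for h*h < 4*m*k."""
    gamma = h / (2 * m)
    omega = math.sqrt(k / m - gamma ** 2)
    decay = r * math.exp(-gamma * t)
    y = decay * math.cos(omega * t + phi)
    v = -gamma * y - omega * decay * math.sin(omega * t + phi)
    return y, v

def energy(t, m, h, k):
    """Total energy (28): kinetic plus potential."""
    y, v = damped_state(t, m, h, k)
    return 0.5 * (m * v * v + k * y * y)

def energy_rate_check(t, m, h, k, dt=1e-6):
    """Return (numerical dE/dt, -h*v**2); the two values should agree."""
    dE = (energy(t + dt, m, h, k) - energy(t - dt, m, h, k)) / (2 * dt)
    _, v = damped_state(t, m, h, k)
    return dE, -h * v * v

print(energy_rate_check(0.7, 1.0, 0.4, 4.0))
print(energy(3.0, 1.0, 0.0, 4.0))   # h = 0: energy stays at k*r**2/2
```

With h = 0 the same functions reproduce the constant energy k r^2 / 2 found above for frictionless oscillations.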

In practical problems we are not interested in the general solution but in a specific particular solution. Since there are two arbitrary constants in the general solution of a second-order differential equation (that is to say, two degrees of freedom), a particular solution can be chosen only if we specify two supplementary relations that allow us to find the values of these arbitrary constants. Such supplementary conditions are usually in the form of initial conditions: the values of the desired function and its derivative for a certain value of the independent variable are specified.

The initial conditions for (21) consist in specifying the values

$$ y\Big|_{t=t_0} = y_0, \qquad \frac{dy}{dt}\bigg|_{t=t_0} = v_0 \tag{29} $$

that is, the initial deviation and the initial velocity of the oscillating point. From physical reasoning it is clear that these conditions fully determine the process of oscillation. This can readily be proved mathematically as well. Consider, say, the case of considerable friction when the general solution is of the form (25). Differentiating this solution and substituting $t = t_0$, we get, on the basis of (29),

$$ C_1 e^{-a t_0} + C_2 e^{-b t_0} = y_0, \qquad -C_1 a e^{-a t_0} - C_2 b e^{-b t_0} = v_0 $$

whence for $C_1$ and $C_2$ we find the definite values

$$ C_1 = \frac{b y_0 + v_0}{b - a}\,e^{a t_0}, \qquad C_2 = \frac{a y_0 + v_0}{a - b}\,e^{b t_0} $$

Putting these values into (25), we obtain the sought-for particular solution that satisfies the conditions (29):

$$ y = \frac{b y_0 + v_0}{b - a}\,e^{a t_0} e^{-a t} + \frac{a y_0 + v_0}{a - b}\,e^{b t_0} e^{-b t} = y_0\,\frac{b e^{-a(t - t_0)} - a e^{-b(t - t_0)}}{b - a} + v_0\,\frac{e^{-a(t - t_0)} - e^{-b(t - t_0)}}{b - a} \tag{30} $$

Formula (30) enables us to consider the intermediate case of $h^2 = 4mk$ that we have not yet examined; in this case the characteristic equation (24) has the coincident roots $p_{1,2} = -a$. Then the solution (25) is not the general solution, since $C_1 e^{-at} + C_2 e^{-at} = (C_1 + C_2)\,e^{-at} = C e^{-at}$; that is, we actually have here only one degree of freedom, not two, as required. If in (30) we pass to the limit as $b \to a$, then, taking advantage of l'Hospital's rule (see HM, Sec. 3.21), in the limit we get the solution

$$ y = y_0\left[e^{-a(t - t_0)} + a(t - t_0)\,e^{-a(t - t_0)}\right] + v_0 (t - t_0)\,e^{-a(t - t_0)} = y_0 e^{-a(t - t_0)} + (a y_0 + v_0)(t - t_0)\,e^{-a(t - t_0)} \tag{31} $$

We see that in the solution here there appears a term of the form $t e^{-at}$. This term tends to zero as t increases, since the exponential function tends to zero faster than any power of t tends to infinity (see HM, Sec. 3.21). Hence, here again we have decay without oscillations.
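
The passage to the limit from (30) to (31) can be observed numerically by taking b very close to a. The Python sketch below is an illustration only; the function names and the sample values a = 2, y0 = 1, v0 = 0.5 are assumptions made for the example.

```python
import math

def overdamped(t, a, b, y0, v0, t0=0.0):
    """Formula (30): solution for distinct real roots -a and -b."""
    s = t - t0
    ea, eb = math.exp(-a * s), math.exp(-b * s)
    return y0 * (b * ea - a * eb) / (b - a) + v0 * (ea - eb) / (b - a)

def critical(t, a, y0, v0, t0=0.0):
    """Formula (31): limiting solution for the coincident root -a."""
    s = t - t0
    return (y0 + (a * y0 + v0) * s) * math.exp(-a * s)

# As b approaches a, formula (30) should approach formula (31).
for eps in (1e-2, 1e-4, 1e-6):
    print(eps, overdamped(0.8, 2.0, 2.0 + eps, 1.0, 0.5) - critical(0.8, 2.0, 1.0, 0.5))
```

The printed differences shrink roughly in proportion to b - a, which is consistent with (31) being the limit of (30).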

If one notes that the initial conditions are arbitrary and regroups the terms in the right-hand member of (31), one can write down the general solution of equation (21) in the given intermediate case as

$$ y = C_1 e^{-at} + C_2 t e^{-at} $$

Exercises 

Find  solutions  to  the  following  differential  equations  that 
satisfy  the  indicated  initial  data. 


1-  y"  + y = 0,  y = 0,  y'  = —2  for  t = ~ • 

2.  4y"  — 8y'  + 5y  = 0,  y ==  0,  yf  *==  -1  for  t — 0. 


3. $y'' - 3y' + 2y = 0$, $y = 2$, $y' = 3$ for $t = 0$.

4. $y'' - y = 0$, $y = 4$, $y' = -2$ for $t = 0$.

5. $y'' - 2y' + y = 0$, $y = 0$, $y' = e$ for $t = 1$.

6. $y'' + 4y' + 4y = 0$, $y = 1$, $y' = 3$ for $t = 0$.


7.4  A simple  second-order  nonhomogeneous 
linear  equation 

We now consider the equation of forced oscillations, equation (20). Note first of all that the reasoning given in Sec. 7.2 on the relationship of the solutions of a nonhomogeneous equation with the solutions of the corresponding homogeneous equation holds true: thus the general solution of (20) is of the form

$$ y = Y(x) + C_1 y_1(x) + C_2 y_2(x) $$

where Y(x) is a particular solution and $C_1 y_1(x) + C_2 y_2(x)$ is the general solution of the corresponding homogeneous equation (21).

As in Sec. 7.2, a particular solution to equation (20) may be constructed with the aid of the Green's function. First let us consider the simplest case of h = k = 0, that is, when (20) is of the form

$$ m\,\frac{d^2y}{dt^2} = f(t) \tag{32} $$

This means that a body is being acted upon by a force whose dependence on time is specified. Let us find a solution that satisfies the zero initial conditions:*

$$ y\Big|_{t=t_0} = 0, \qquad v\Big|_{t=t_0} = \frac{dy}{dt}\bigg|_{t=t_0} = 0 \tag{33} $$

As in Sec. 7.2 (see the beginning of Item III), we see that the solution y(t) is fully determined by specifying the function f(t), and the principle of superposition applies. Thus, on the basis of the general procedure of Sec. 6.2 we can construct the Green's function by solving the equation

$$ m\,\frac{d^2y}{dt^2} = \delta(t - \tau) \tag{34} $$

for any fixed $\tau$ and for the initial conditions (33). For the sake of definiteness we assume that $\tau > t_0$. Then over the interval from $t_0$ to $\tau$ the right side of (34) is equal to zero, and so the solution is also zero for the zero conditions (33). Integrating the equation (34) from $\tau - 0$ to $\tau + 0$,** we get

$$ m\,\frac{dy}{dt}\bigg|_{\tau+0} - m\,\frac{dy}{dt}\bigg|_{\tau-0} = 1 $$

But by what has just been demonstrated $\frac{dy}{dt}\Big|_{\tau-0} = 0$, whence

$$ \frac{dy}{dt}\bigg|_{\tau+0} = \frac{1}{m} \tag{35} $$

* Here we can take $t_0 = -\infty$, which is physically natural for many problems (cf. Sec. 7.2). However, we leave the arbitrary $t_0$ for what follows, since for $t_0 = -\infty$ it is inconvenient to specify nonzero initial conditions.

** See the footnote on p. 218.

We see that for $t = \tau$ the derivative $\frac{dy}{dt}$ has a finite discontinuity (a finite jump); and so the function y(t) itself does not have a discontinuity at $t = \tau$, that is,

$$ y\Big|_{\tau+0} = y\Big|_{\tau-0} = 0 \tag{36} $$

But for $t > \tau$ equation (34) is of the form $m\,\frac{d^2y}{dt^2} = 0$, or $\frac{d^2y}{dt^2} = 0$, and we have to find the solution of this equation for the initial conditions (35) and (36). It is not difficult to verify directly that all these requirements are satisfied by the function $y = \frac{1}{m}(t - \tau)$. We thus have the Green's function of the problem at hand:

$$ G(t, \tau) = \begin{cases} 0 & (t_0 < t < \tau) \\ \dfrac{1}{m}(t - \tau) & (\tau < t < \infty) \end{cases} \tag{37} $$

(The fact that this continuous function with a "corner" at $t = \tau$ satisfies equation (34) also follows from the considerations of Sec. 6.3; see Fig. 76, which depicts a function whose second derivative is equal to $\delta(x)$.) By virtue of the general formula (8) of Ch. 6 applied to the given case, we obtain the desired solution of equation (32) for the initial conditions (33):

$$ y(t) = \int_{t_0}^{\infty} G(t, \tau)\,f(\tau)\,d\tau = \int_{t_0}^{t} G(t, \tau)\,f(\tau)\,d\tau + \int_{t}^{\infty} G(t, \tau)\,f(\tau)\,d\tau = \frac{1}{m}\int_{t_0}^{t} (t - \tau)\,f(\tau)\,d\tau \tag{38} $$

This  completes  the  derivation  of  the  formula  for  this  solution, 
and  the  reader  can  go  straight  to  Sec.  7.5  where  we  discuss  the 
general  equation  of  forced  oscillations.  However,  we  have  a few  more 
remarks  to  make  with  respect  to  the  particular  case  we  are  now 
discussing. 

The Green's function (37) and also formula (38) may be obtained directly by physical arguments. Equation (34), which defines the Green's function, signifies that a body which at the initial time was located at the origin and was not in motion was acted upon at time $\tau$ by an instantaneous force with unit impulse (see the end of Sec. 6.1). But after the action of the brief force the body moves with a constant velocity equal to the ratio of the impulse to the mass of the body (see HM, Sec. 6.5), which in this case is 1/m. For this reason, y(t), or the path traversed, is expressed just by formulas (37).

Formula (38) may be obtained without invoking the Green's function, although, actually, the method is the same. One imagines the force f over the interval $t_0$ to t as a sequence of short-term forces, each of which operates over a certain interval from $\tau$ to $\tau + d\tau$ and therefore has the impulse $f(\tau)\,d\tau$. If this short-term force were acting alone, then the body would attain a velocity $\frac{f(\tau)\,d\tau}{m}$ and by time t would traverse a path

$$ \frac{f(\tau)\,d\tau}{m}\,(t - \tau) \tag{39} $$

But due to the linearity of equation (32), we have the principle of superposition (in other words, the principle of addition of motions, according to which the laws of motion are additive when several forces are superimposed on one another). Therefore, the results (39) must be added over all $\tau$ from $t_0$ to t; we thus arrive at formula (38).

Now let us derive the formula (38) in a different way, by a twofold integration of (32). The first integration yields

$$ m\left[y'(t) - y'(t_0)\right] = \int_{t_0}^{t} f(t)\,dt $$

or, taking into account the second condition of (33),

$$ m y'(t) = \int_{t_0}^{t} f(t)\,dt $$

In this formula the variable of integration is denoted by the same letter t as the upper limit. Ordinarily, this does not cause any confusion, but it is more convenient here to use the more accurate notation

$$ m y'(t) = \int_{t_0}^{t} f(\tau)\,d\tau \tag{40} $$

in which we can strictly distinguish between the variable of integration $\tau$ and the upper limit t: the velocity of the body at time t depends on the values of the force at all preceding instants $\tau$, which means it depends on $f(\tau)$, where $\tau$ assumes all values from $t_0$ to t. (This distinction is absolutely necessary in formula (38), where the difference $t - \tau$ appears in the integrand.)

Fig. 83


Again integrating (40), we obtain, with account taken of the first condition of (33),

$$ m y(t) = \int_{t_0}^{t} m y'(t_1)\,dt_1 = \int_{t_0}^{t} \left(\int_{t_0}^{t_1} f(\tau)\,d\tau\right) dt_1 $$

We have a twofold iterated integral (cf. Sec. 4.7) in which the variable $t_1$ of the outer integration varies from $t_0$ to t, and for each $t_1$ the variable $\tau$ of inner integration varies from $\tau = t_0$ to $\tau = t_1$. Thus the integration is performed over the triangle $(\sigma)$ shown in Fig. 83. But, as we know, a double integral can be evaluated in the reverse order, first with respect to $t_1$ and then with respect to $\tau$. Then the integration with respect to $t_1$ will be carried out for fixed $\tau$ from $t_1 = \tau$ to $t_1 = t$ (see Fig. 83) and the outer integration with respect to $\tau$ from $t_0$ to t. We then get

$$ m y(t) = \iint_{(\sigma)} f(\tau)\,d\tau\,dt_1 = \int_{t_0}^{t} d\tau \int_{\tau}^{t} f(\tau)\,dt_1 $$

But we can take $f(\tau)$ out from under the inner integral sign: it does not depend on the variable of integration $t_1$, so that

$$ m y(t) = \int_{t_0}^{t} d\tau\,f(\tau) \int_{\tau}^{t} dt_1 = \int_{t_0}^{t} d\tau \cdot f(\tau)\,(t - \tau) $$

and we arrive at formula (38).
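
The interchange of the order of integration can be checked numerically by computing m y(t) both ways. The Python sketch below is an illustration only: the helper names and the sample force f = cos with m = 1, t0 = 0 are assumptions, and for this force both expressions should equal 1 - cos t.

```python
import math

def simpson(func, a, b, n=100):
    """Composite Simpson rule on [a, b]; n is an even number of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def iterated(f, t0, t):
    """Outer integral over t1 of the inner integral of f(tau) from t0 to t1."""
    return simpson(lambda t1: simpson(f, t0, t1), t0, t)

def single(f, t0, t):
    """Single integral of (t - tau) * f(tau) from t0 to t, as in formula (38) with m = 1."""
    return simpson(lambda tau: (t - tau) * f(tau), t0, t)

print(iterated(math.cos, 0.0, 2.0), single(math.cos, 0.0, 2.0), 1 - math.cos(2.0))
```

The three printed values should agree up to the quadrature error.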

Let us verify that (38) does indeed yield the solution of this problem. It will also be a useful exercise in differentiation. We will proceed from two formulas. The first,

$$ \frac{d}{dt}\left(\int_{a}^{t} \varphi(s)\,ds\right) = \varphi(t) \tag{41} $$

(the derivative of the integral with respect to the upper limit) is familiar from integral calculus; the second formula,

$$ \frac{d}{dt}\left(\int_{a}^{b} F(s, t)\,ds\right) = \int_{a}^{b} F_t'(s, t)\,ds \quad (a, b = \text{constant}) \tag{42} $$

(the derivative of the integral with respect to a parameter) was given in Sec. 3.6. But how do we find the derivative of the right-hand side of (38), where t appears both as a limit of integration and as the parameter under the integral sign? To do this we have to take the sum of two terms, one of which is obtained by differentiating the integral (38) for fixed t under the integral sign, and the other for fixed t in the upper limit (cf. HM, Sec. 3.4). The differentiation is performed by formulas (41) and (42):

$$ \frac{dy}{dt} = \frac{1}{m}\,(t - \tau)\,f(\tau)\Big|_{\tau = t} + \frac{1}{m}\int_{t_0}^{t} \frac{\partial}{\partial t}\left[(t - \tau)\,f(\tau)\right] d\tau = \frac{1}{m}\int_{t_0}^{t} f(\tau)\,d\tau \tag{43} $$

We use (41) in the second differentiation:

$$ \frac{d^2y}{dt^2} = \frac{1}{m}\,\frac{d}{dt}\int_{t_0}^{t} f(\tau)\,d\tau = \frac{1}{m}\,f(t) $$

This implies that equation (32) is satisfied. The conditions (33) hold if we put $t = t_0$ in (38) and (43).

By way of an illustration, let us find the law of motion of a body acted upon by a force proportional to the time elapsed from the start of motion. In this case, $f(t) = a(t - t_0)$ and so $f(\tau) = a(\tau - t_0)$. We get

$$ y(t) = \frac{1}{m}\int_{t_0}^{t} (t - \tau)\,a(\tau - t_0)\,d\tau = \frac{a}{m}\int_{t_0}^{t} \left[(t - t_0) - (\tau - t_0)\right](\tau - t_0)\,d\tau = \frac{a}{m}\,(t - t_0)\,\frac{(t - t_0)^2}{2} - \frac{a}{m}\,\frac{(t - t_0)^3}{3} = \frac{a(t - t_0)^3}{6m} $$
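
The result of this illustration can be reproduced by evaluating formula (38) with a quadrature rule. The Python sketch below is only an illustration; the names `simpson` and `forced_motion` and the sample values a = 3, m = 2, t0 = 1 are assumptions made for the example.

```python
def simpson(func, a, b, n=200):
    """Composite Simpson rule on [a, b]; n is an even number of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def forced_motion(f, m, t0, t):
    """Formula (38): y(t) = (1/m) * integral from t0 to t of (t - tau) * f(tau) d(tau)."""
    return simpson(lambda tau: (t - tau) * f(tau), t0, t) / m

# Assumed sample data: force a*(t - t0) with a = 3, mass m = 2, start time t0 = 1.
a, m, t0 = 3.0, 2.0, 1.0
ramp = lambda tau: a * (tau - t0)

t = 4.0
print(forced_motion(ramp, m, t0, t), a * (t - t0) ** 3 / (6 * m))
```

Since the integrand is a quadratic polynomial in tau, Simpson's rule is exact here up to rounding, and both printed values should coincide with a(t - t0)^3 / (6m).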


We solved the problem with zero initial data. This is the most important part of the job because if we can solve a problem with zero initial data, then the solution of the problem with arbitrary initial data, $y = y_0$, $v = v_0$, for $t = t_0$ offers no difficulties at all. Indeed, suppose

$$ m\,\frac{d^2y}{dt^2} = f(t), \qquad y = y_0, \quad v = v_0 \quad \text{for } t = t_0 $$

Then

$$ y(t) = y^{(1)}(t) + y^{(2)}(t) $$

where $y^{(1)}(t)$ is the solution of the problem with zero initial data,

$$ m\,\frac{d^2y^{(1)}}{dt^2} = f(t), \qquad y^{(1)}(t_0) = 0, \qquad \frac{dy^{(1)}}{dt}\bigg|_{t=t_0} = v^{(1)}(t_0) = 0 $$

and $y^{(2)}(t)$ is the solution without a force:

$$ m\,\frac{d^2y^{(2)}}{dt^2} = 0, \qquad y^{(2)}(t_0) = y_0, \qquad \frac{dy^{(2)}}{dt}\bigg|_{t=t_0} = v^{(2)}(t_0) = v_0 $$

(The reader can easily verify that $y = y^{(1)} + y^{(2)}$ is a solution of this problem.) It is easy to find the function $y^{(2)}(t)$: $y^{(2)}(t) = v_0 (t - t_0) + y_0$, and so

$$ y(t) = \frac{1}{m}\int_{t_0}^{t} (t - \tau)\,f(\tau)\,d\tau + v_0 (t - t_0) + y_0 $$


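Let us investigate the solution (38) which corresponds to zero initial conditions; for simplicity we assume that $t_0 = -\infty$. We then write

$$ y(t) = \frac{1}{m}\int_{-\infty}^{t} (t - \tau)\,f(\tau)\,d\tau = \frac{1}{m}\int_{-\infty}^{t} t\,f(\tau)\,d\tau - \frac{1}{m}\int_{-\infty}^{t} \tau f(\tau)\,d\tau $$

Since t is not the variable of integration, we can take it outside the integral sign, and so

$$ y(t) = \frac{t}{m}\int_{-\infty}^{t} f(\tau)\,d\tau - \frac{1}{m}\int_{-\infty}^{t} \tau f(\tau)\,d\tau $$

On the basis of (43) we can rewrite this formula thus:

$$ y(t) = t\,v(t) - \frac{1}{m}\int_{-\infty}^{t} \tau f(\tau)\,d\tau $$

Factor out v(t) to get

$$ y(t) = v(t) \cdot (t - \theta) $$

where

$$ \theta = \frac{\dfrac{1}{m}\displaystyle\int_{-\infty}^{t} \tau f(\tau)\,d\tau}{v(t)} = \frac{\displaystyle\int_{-\infty}^{t} \tau f(\tau)\,d\tau}{\displaystyle\int_{-\infty}^{t} f(\tau)\,d\tau} $$

Fig. 84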


The formulas written in this fashion are particularly convenient if the force ceases to act after a time interval. For times t after the force has ceased to act, the integrals $\int_{-\infty}^{t} \tau f(\tau)\,d\tau$ and $\int_{-\infty}^{t} f(\tau)\,d\tau$ no longer depend on t. An increase in t in these integrals only leads to an increase in that portion of the region of integration where the integrand is zero. After the force ceases to act, the body moves with a constant velocity $v = v_{\text{ter}}$ and the quantity $\theta = \theta_{\text{ter}}$ is also constant. Therefore, after the action has ceased, the graph of y(t) is a straight line:

$$ y = v_{\text{ter}}\,(t - \theta_{\text{ter}}) $$

The quantity $\theta_{\text{ter}}$ is the abscissa of the point of intersection of this straight line with the t-axis (Fig. 84). The physical meaning of $\theta_{\text{ter}}$ is this: if a body begins to move at time $t = \theta_{\text{ter}}$ with velocity $v = v_{\text{ter}}$, then it will move via the same law as a body actually moves after the force ceases to act.
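
For a force acting only over a finite interval, the terminal velocity and the terminal value of theta can be computed directly from the two integrals above. The Python sketch below is an illustration only; the function names, the sample force sin(tau) acting on the interval from 0 to pi, and the mass m = 4 are assumptions made for the example.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson rule on [a, b]; n is an even number of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def terminal_line(f, m, t_start, t_end):
    """Terminal velocity and theta for a force f assumed to vanish outside [t_start, t_end]."""
    impulse = simpson(f, t_start, t_end)                       # integral of f(tau)
    moment = simpson(lambda tau: tau * f(tau), t_start, t_end) # integral of tau * f(tau)
    return impulse / m, moment / impulse

def position_from_38(f, m, t_start, t_end, t):
    """Formula (38) for t >= t_end, integrating only where the force acts."""
    return simpson(lambda tau: (t - tau) * f(tau), t_start, t_end) / m

v_ter, theta_ter = terminal_line(math.sin, 4.0, 0.0, math.pi)
print(v_ter, theta_ter)
print(position_from_38(math.sin, 4.0, 0.0, math.pi, 5.0), v_ter * (5.0 - theta_ter))
```

For this symmetric pulse theta comes out at the middle of the interval of action, pi/2, and the straight line v(t - theta) reproduces the position given by (38) at any later time.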

Exercises 

Find  solutions  to  the  following  differential  equations  that  satisfy 
the  indicated  initial  data. 


1. $\frac{d^2x}{dt^2} = 0$, $x(2) = 1$, $x'(2) = -3$.

2. $\frac{d^2x}{dt^2} = 1$, $x(0) = -2$, $x'(0) = 0$.

3. $\frac{d^2x}{dt^2} = \sin t$, $x(0) = 0$, $x'(0) = 1$.

4. $\frac{d^2x}{dt^2} = e^t$, $x(-\infty) = 0$, $x'(-\infty) = 0$.

7.5  Second-order nonhomogeneous linear equations
with constant coefficients

Green's function can be applied to the general equation (20), which describes the motion of an elastically attached body under the action of an external force dependent solely on the time, in the presence of friction that is proportional to the velocity.

As in Sec. 7.4, we seek the solution for the zero initial conditions (33). To construct the Green's function, it is necessary, as in Sec. 7.4 (see (34)), to solve the equation

$$m\frac{d^2y}{dt^2} + h\frac{dy}{dt} + ky = \delta(t - \tau) \tag{44}$$


under the initial conditions (33). Assuming $\tau > t_0$, we find that $y(t) \equiv 0$ for $t_0 < t < \tau$; and integrating (44) from $t = \tau - 0$ to $t = \tau + 0$, we arrive at the same conditions (35) and (36), since the integrals of the finite second and third terms in (44) are zero. Thus, for $t > \tau$, it is required to solve the homogeneous equation (21) for the initial conditions (35) and (36). Proceeding from the general solution of equation (21),

$$y = C_1 e^{p_1 t} + C_2 e^{p_2 t}$$

where $p_1$ and $p_2$ are roots of the characteristic equation (24), and reasoning as we did at the end of Sec. 7.3, we obtain the desired solution

$$y = \frac{e^{-p_1\tau}}{m(p_1 - p_2)}\,e^{p_1 t} + \frac{e^{-p_2\tau}}{m(p_2 - p_1)}\,e^{p_2 t} = \frac{1}{m(p_1 - p_2)}\left[e^{p_1(t-\tau)} - e^{p_2(t-\tau)}\right]$$


This solution is suitable both for small and for large friction. We thus obtain the Green's function:

$$G(t, \tau) = \begin{cases} 0 & (t_0 < t < \tau), \\[4pt] \dfrac{1}{m(p_1 - p_2)}\left[e^{p_1(t-\tau)} - e^{p_2(t-\tau)}\right] & (\tau < t < \infty) \end{cases}$$

(As under the conditions of Sec. 7.4, this function is continuous but has a salient point at $t = \tau$.) From this, like (38), we obtain the solution of equation (20) under the zero initial conditions (33):

$$y(t) = \frac{1}{m(p_1 - p_2)}\int_{t_0}^{t}\left[e^{p_1(t-\tau)} - e^{p_2(t-\tau)}\right] f(\tau)\,d\tau \tag{45}$$

As in Sec. 7.2 (Item IV), equation (20) can be solved without any Green's function when the external load is of a particularly simple type. This occurs if $f(t)$ = constant, that is, if we consider the equation

$$m\frac{d^2y}{dt^2} + h\frac{dy}{dt} + ky = A \quad (= \text{constant}) \tag{46}$$
It is easy to find a particular solution of the form $y = B$ = constant. Substituting into (46), we get

$$0 + 0 + kB = A, \quad\text{or}\quad B = \frac{A}{k}$$

Taking note of the remark made at the beginning of Sec. 7.4, we get the general solution of equation (46):

$$y = \frac{A}{k} + C_1 e^{p_1 t} + C_2 e^{p_2 t} \tag{47}$$

where $C_1$ and $C_2$ are arbitrary constants determined from the initial conditions. In Sec. 7.3 we saw that the solution of the homogeneous equation (21) tends to zero as $t$ increases, since $p_1$ and $p_2$ are either negative real or complex with a negative real part; thus, from (47) we get, for $t$ large,

$$y = \frac{A}{k} \tag{48}$$

Physically  speaking,  this  result  is  clear.  Given  a constant  external 
force  and  friction,  the  oscillations  decay,  and  after  the  “transient 
process"  determined  by  the  initial  conditions  passes,  the  body  will 
stop  in  a position  where  the  elastic  force  ky  (with  sign  reversed)  will 
be  equal  to  the  external  force  A,  whence  we  get  (48).  This  stationary 
position  no  longer  depends  on  the  initial  conditions. 
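The approach to the stationary position (48) can also be observed by evaluating the Green's-function formula (45) directly for a constant force. The following is a minimal sketch under stated assumptions: the characteristic equation (24) is taken to be $mp^2 + hp + k = 0$ (consistent with the root relations used in Sec. 7.6), the roots are assumed distinct, and the parameter values, the function names and the use of Simpson's rule are choices made here for illustration.

```python
import cmath


def char_roots(m, h, k):
    """Roots of m*p**2 + h*p + k = 0 (assumed form of the characteristic equation)."""
    d = cmath.sqrt(h * h - 4 * m * k)
    return (-h + d) / (2 * m), (-h - d) / (2 * m)


def green(t, tau, m, h, k):
    """Green's function G(t, tau); valid only for distinct roots p1 != p2."""
    if t <= tau:
        return 0.0
    p1, p2 = char_roots(m, h, k)
    s = t - tau
    # For real m, h, k the quotient is real; .real only discards rounding noise.
    return ((cmath.exp(p1 * s) - cmath.exp(p2 * s)) / (m * (p1 - p2))).real


def response(f, t, m, h, k, t0=0.0, n=4000):
    """Formula (45) evaluated by Simpson's rule on [t0, t]; n must be even."""
    step = (t - t0) / n
    total = green(t, t0, m, h, k) * f(t0) + green(t, t, m, h, k) * f(t)
    for i in range(1, n):
        tau = t0 + i * step
        total += (4 if i % 2 else 2) * green(t, tau, m, h, k) * f(tau)
    return total * step / 3


if __name__ == "__main__":
    A, m, h, k = 2.0, 1.0, 0.5, 4.0          # hypothetical parameter values
    for t in (1.0, 5.0, 40.0):
        # Second column: y(t) from (45); third column: the limiting value A/k.
        print(t, response(lambda tau: A, t, m, h, k), A / k)
```

Using `cmath` lets the same code cover both real and complex-conjugate roots, at the price of excluding the case of equal roots, where the formula as written divides by zero.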

The solution of the equation

$$m\frac{d^2y}{dt^2} + h\frac{dy}{dt} + ky = Ae^{qt} \tag{49}$$

is also simple. Here it is easy to find a particular solution of the form $y = Be^{qt}$. Substituting, we get $mBq^2e^{qt} + hBqe^{qt} + kBe^{qt} = Ae^{qt}$, or $B = \dfrac{A}{mq^2 + hq + k}$, whence the desired particular solution is of the form

$$y = \frac{Ae^{qt}}{mq^2 + hq + k} \tag{50}$$

This solution is unsuitable if $q$ is a root of the characteristic equation (24), because then the denominator vanishes. As in Sec. 7.2, it can be shown, by proceeding from the general formula (45), that in this special case the equation (49) has a particular solution of the form $Bte^{qt}$.
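The constant $B$ in this special case can also be found by direct substitution, which is a different route from the one via formula (45) mentioned above. The short derivation below assumes that $q$ is a simple root of the characteristic equation, taken here in the form $mq^2 + hq + k = 0$, so that $2mq + h \neq 0$.

```latex
\begin{aligned}
y &= Bte^{qt}, \qquad
\frac{dy}{dt} = B(1 + qt)e^{qt}, \qquad
\frac{d^2y}{dt^2} = B(2q + q^2 t)e^{qt}, \\
m\frac{d^2y}{dt^2} + h\frac{dy}{dt} + ky
  &= Be^{qt}\left[(mq^2 + hq + k)\,t + (2mq + h)\right]
   = B(2mq + h)e^{qt}, \\
B(2mq + h)e^{qt} &= Ae^{qt}
  \quad\Longrightarrow\quad
  B = \frac{A}{2mq + h}.
\end{aligned}
```

The term proportional to $t$ drops out precisely because $q$ satisfies the characteristic equation; if $q$ were a double root, $2mq + h$ would vanish as well and this form of solution would fail in turn.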

Finally, let us consider the solution of the equation

$$m\frac{d^2y}{dt^2} + h\frac{dy}{dt} + ky = A\sin\omega t \tag{51}$$

Here we can take advantage of the fact that, by virtue of Euler's formula, the right member is the imaginary part of the function $Ae^{i\omega t}$. Hence, it suffices to solve the equation

$$m\frac{d^2y}{dt^2} + h\frac{dy}{dt} + ky = Ae^{i\omega t}$$

and to take the imaginary part (see similar reasoning in Sec. 5.5). On the basis of (50) we get the complex solution

$$y = \frac{Ae^{i\omega t}}{m(i\omega)^2 + hi\omega + k} = \frac{A}{k - m\omega^2 + ih\omega}\,e^{i\omega t} = \frac{A[(k - m\omega^2) - ih\omega]}{(k - m\omega^2)^2 + h^2\omega^2}\,(\cos\omega t + i\sin\omega t)$$

whence it is easy to isolate the imaginary part, that is, a particular solution of (51):

$$y = \frac{A}{(k - m\omega^2)^2 + h^2\omega^2}\left[(k - m\omega^2)\sin\omega t - h\omega\cos\omega t\right] \tag{52}$$

In order to find the general solution of equation (51) we have to add to the particular solution (52) the general solution of the corresponding homogeneous equation. But since each of the solutions of the homogeneous equation tends to zero as $t$ increases, it follows that after the transient process the body will begin to vibrate by the harmonic law (52), which does not depend on the initial conditions. This steady-state solution could have been found using the methods of Sec. 5.5.
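The passage from the complex solution to the real formula (52) is easy to cross-check by machine. The sketch below evaluates (52) directly and, separately, takes the imaginary part of (50) with $q = i\omega$; the function names and the sample parameter values are assumptions made for illustration only.

```python
import cmath
import math


def steady_state(t, A, omega, m, h, k):
    """Particular solution (52) of m*y'' + h*y' + k*y = A*sin(omega*t)."""
    denom = (k - m * omega**2) ** 2 + (h * omega) ** 2
    return A / denom * ((k - m * omega**2) * math.sin(omega * t)
                        - h * omega * math.cos(omega * t))


def steady_state_complex(t, A, omega, m, h, k):
    """Imaginary part of the complex solution (50) with q = i*omega."""
    q = 1j * omega
    return (A * cmath.exp(q * t) / (m * q * q + h * q + k)).imag


def amplitude(A, omega, m, h, k):
    """Amplitude of the steady-state oscillation (52)."""
    return A / math.sqrt((k - m * omega**2) ** 2 + (h * omega) ** 2)


if __name__ == "__main__":
    args = (2.0, 1.5, 1.0, 0.4, 4.0)   # hypothetical A, omega, m, h, k
    for t in (0.0, 0.7, 3.1):
        # Both columns should coincide up to rounding.
        print(steady_state(t, *args), steady_state_complex(t, *args))
```

The amplitude returned by `amplitude` is the modulus of the complex factor $A/(k - m\omega^2 + ih\omega)$, which is why working with $e^{i\omega t}$ gives amplitude and phase in one step.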

Let us investigate the case where friction is absent: that is, equation (20) is replaced by

$$m\frac{d^2y}{dt^2} + ky = f(t) \tag{53}$$

Here the characteristic equation has the roots $p_{1,2} = \pm i\omega_0$, where

$$\omega_0 = \sqrt{\frac{k}{m}}$$

is the frequency of natural oscillations of the system (that is, without an external force). Formula (45), transformed by Euler's formula, yields a solution for the zero initial conditions (33):

$$y = \frac{1}{m\,2i\omega_0}\int_{t_0}^{t}\left[e^{i\omega_0(t-\tau)} - e^{-i\omega_0(t-\tau)}\right] f(\tau)\,d\tau = \frac{1}{m\omega_0}\int_{t_0}^{t}\sin\omega_0(t-\tau)\,f(\tau)\,d\tau \tag{54}$$
(Verify by means of differentiation that the right member does indeed satisfy (53) and the initial conditions (33).) Using the formula

$$\sin\omega_0(t - \tau) = \sin\omega_0 t\,\cos\omega_0\tau - \sin\omega_0\tau\,\cos\omega_0 t$$

we can rewrite the solution as

$$y(t) = \frac{\sin\omega_0 t}{m\omega_0}\int_{t_0}^{t} f(\tau)\cos\omega_0\tau\,d\tau - \frac{\cos\omega_0 t}{m\omega_0}\int_{t_0}^{t} f(\tau)\sin\omega_0\tau\,d\tau$$

If after the elapse of a time interval $t_1$ the action of the force $f(t)$ ceases, then the integrals in this formula will cease to depend on the time for $t > t_1$ and the solution becomes

$$y(t) = a\cos\omega_0 t + b\sin\omega_0 t \quad\text{if } t > t_1$$

where

$$a = -\frac{1}{m\omega_0}\int_{t_0}^{t_1} f(\tau)\sin\omega_0\tau\,d\tau, \qquad b = \frac{1}{m\omega_0}\int_{t_0}^{t_1} f(\tau)\cos\omega_0\tau\,d\tau$$

Thus, if originally a body is at rest and then for a time is acted upon by an external force $f(t)$, then when the force ceases to act the body will perform natural oscillations with a frequency $\omega_0$ and an amplitude $\sqrt{a^2 + b^2}$.
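As a small worked case, the coefficients $a$ and $b$ and the resulting amplitude can be computed for a concrete force. The sketch below assumes a constant unit force acting on $0 \le \tau \le 1$, with $m = 1$ and $\omega_0 = 2$; these values, the helper names and the use of Simpson's rule are illustrative assumptions, not taken from the text.

```python
import math


def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def free_oscillation_coeffs(f, t0, t1, m, omega0):
    """Coefficients (a, b) of y = a*cos(omega0*t) + b*sin(omega0*t) for t > t1."""
    a = -simpson(lambda tau: f(tau) * math.sin(omega0 * tau), t0, t1) / (m * omega0)
    b = simpson(lambda tau: f(tau) * math.cos(omega0 * tau), t0, t1) / (m * omega0)
    return a, b


if __name__ == "__main__":
    a, b = free_oscillation_coeffs(lambda tau: 1.0, 0.0, 1.0, m=1.0, omega0=2.0)
    print(a, b, math.hypot(a, b))   # hypot gives the amplitude sqrt(a^2 + b^2)
```

For a constant force $F$ of duration $t_1$ starting at $t_0 = 0$ the integrals can be done by hand, giving the amplitude $\dfrac{2F}{m\omega_0^2}\left|\sin\dfrac{\omega_0 t_1}{2}\right|$; the residual oscillation therefore vanishes when the pulse lasts a whole number of natural periods.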

Formula (54) can be written differently if we introduce a "complex deviation from the equilibrium position":

$$w(t) = \frac{1}{m\omega_0}\int_{t_0}^{t} e^{i\omega_0(t-\tau)} f(\tau)\,d\tau = \frac{1}{m\omega_0}\int_{t_0}^{t} e^{-i\omega_0\tau} f(\tau)\,d\tau \cdot e^{i\omega_0 t}$$

for which the real deviation $y(t)$ serves as the imaginary part. If $t_0$ is the initial time of action of the force, then the formula can be rewritten as

$$w(t) = \frac{1}{m\omega_0}\int_{-\infty}^{t} e^{-i\omega_0\tau} f(\tau)\,d\tau \cdot e^{i\omega_0 t}$$

because under this transformation a part equal to zero is added to the integral. If the force acts only up to a certain time $t_1$, then, from that time onwards, we can write

$$w(t) = \frac{1}{m\omega_0}\int_{-\infty}^{\infty} e^{-i\omega_0\tau} f(\tau)\,d\tau \cdot e^{i\omega_0 t}$$

(this integral is actually extended over the interval from $t = t_0$ to $t = t_1$). The result is a harmonic oscillation with frequency $\omega_0$ and amplitude

$$\frac{1}{m\omega_0}\left|\int_{-\infty}^{\infty} e^{-i\omega_0\tau} f(\tau)\,d\tau\right|$$

Integrals like these (they are called Fourier integrals) will be considered in Ch. 14.

If the driving force is of the form $f(t) = A\sin\omega t$, then we can find a particular solution of the equation by formula (52), which is simplified in the absence of friction (i.e. for $h = 0$) and assumes the form

$$y = \frac{A}{k - m\omega^2}\sin\omega t = \frac{A}{m(\omega_0^2 - \omega^2)}\sin\omega t \tag{56}$$

Superimposed on this oscillation, which occurs with the frequency $\omega$ of the driving force, is a natural oscillation with natural frequency $\omega_0$, which depends on the initial conditions and does not decay in the absence of friction.

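Of particular interest is the case where a sinusoidal external force acts on an oscillator without friction under zero initial conditions. The solution is then found from formula (54) and, assuming $t_0 = 0$ for the sake of simplicity, we get

$$\begin{aligned}
y &= \frac{1}{m\omega_0}\int_0^t \sin\omega_0(t-\tau)\,A\sin\omega\tau\,d\tau \\
&= \frac{A}{m\omega_0}\int_0^t \frac{1}{2}\left[\cos(\omega\tau - \omega_0(t-\tau)) - \cos(\omega\tau + \omega_0(t-\tau))\right]d\tau \\
&= \frac{A}{2m\omega_0}\left[\frac{\sin(\omega\tau - \omega_0(t-\tau))}{\omega + \omega_0} - \frac{\sin(\omega\tau + \omega_0(t-\tau))}{\omega - \omega_0}\right]_{\tau=0}^{t} \\
&= \frac{A}{2m\omega_0}\left[\frac{\sin\omega t}{\omega + \omega_0} - \frac{\sin\omega t}{\omega - \omega_0} - \frac{\sin(-\omega_0 t)}{\omega + \omega_0} + \frac{\sin\omega_0 t}{\omega - \omega_0}\right] \\
&= \frac{A}{m\omega_0(\omega + \omega_0)(\omega - \omega_0)}\,(\omega\sin\omega_0 t - \omega_0\sin\omega t)
\end{aligned} \tag{57}$$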


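Let the frequency $\omega$ of the driving force be close to the natural frequency $\omega_0$ of the oscillator. Transforming the right side of (57) by the formula

$$\begin{aligned}
y &= \frac{A}{m\omega_0(\omega + \omega_0)(\omega - \omega_0)}\left[\omega_0\sin\omega_0 t - \omega_0\sin\omega t + (\omega - \omega_0)\sin\omega_0 t\right] \\
&= \frac{A}{m(\omega + \omega_0)(\omega - \omega_0)}\,(\sin\omega_0 t - \sin\omega t) + \frac{A}{m\omega_0(\omega + \omega_0)}\sin\omega_0 t \\
&= \frac{-2A}{m(\omega + \omega_0)(\omega - \omega_0)}\,\sin\frac{\omega - \omega_0}{2}t\,\cos\frac{\omega_0 + \omega}{2}t + \frac{A}{m\omega_0(\omega + \omega_0)}\sin\omega_0 t
\end{aligned}$$

and replacing (approximately) $\omega_0 + \omega$ by $2\omega_0$, we get

$$y \approx \frac{-A}{m\omega_0(\omega - \omega_0)}\,\sin\frac{\omega - \omega_0}{2}t \cdot \cos\omega_0 t + \frac{A}{2m\omega_0^2}\sin\omega_0 t$$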
The most interesting thing is the first, principal, term. It can be written in the form

$$M(t)\cos\omega_0 t, \quad\text{where}\quad M(t) = \frac{-A}{m\omega_0(\omega - \omega_0)}\,\sin\frac{\omega - \omega_0}{2}t$$

and interpreted as a harmonic oscillation having frequency $\omega_0$ and a slowly varying amplitude. This amplitude varies from 0 to

$$M_0 = \max|M(t)| = \frac{A}{m\omega_0\,|\omega - \omega_0|} \tag{58}$$

with period

$$\frac{2\pi}{|\omega - \omega_0|}$$

Oscillations like these are termed beats (the graph is shown in Fig. 85). They result from the interference of a forced oscillation (56) and a natural oscillation, due to their being close in amplitude and frequency (see formula (57)). We see that both the build-up time for beats and the amplitude of the beats are inversely proportional to $|\omega - \omega_0|$, which is the difference between the frequency of natural oscillations and that of the driving force.
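The beat pattern can be explored numerically from the exact solution (57). The following sketch is an illustration only; the parameter values ($A = 1$, $m = 1$, $\omega_0 = 10$, $\omega = 10.5$) and the function names are assumptions introduced here. It compares the largest sampled displacement over one beat period with the amplitude $M_0$ of (58).

```python
import math


def forced_undamped(t, A, omega, omega0, m):
    """Formula (57): frictionless oscillator, zero initial data at t = 0, omega != omega0."""
    return (A * (omega * math.sin(omega0 * t) - omega0 * math.sin(omega * t))
            / (m * omega0 * (omega + omega0) * (omega - omega0)))


def beat_amplitude(A, omega, omega0, m):
    """Formula (58): maximum of the slowly varying amplitude."""
    return A / (m * omega0 * abs(omega - omega0))


def beat_period(omega, omega0):
    """Period with which the slowly varying amplitude |M(t)| repeats."""
    return 2 * math.pi / abs(omega - omega0)


if __name__ == "__main__":
    A, omega, omega0, m = 1.0, 10.5, 10.0, 1.0      # hypothetical values
    T = beat_period(omega, omega0)
    peak = max(abs(forced_undamped(i * T / 20000, A, omega, omega0, m))
               for i in range(20001))
    print(peak, beat_amplitude(A, omega, omega0, m))
```

Bounding the two sines in (57) by one shows that $|y|$ can never exceed $M_0$, so (58) is an upper bound for the exact solution and not merely an approximation to its peak.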

If the oscillator has a low friction, then the oscillation process will also begin with beats, but after a sufficient time the natural oscillation decays and only the constrained oscillation (52) is left. Its amplitude is equal, for very small $h$, to

$$\frac{A\sqrt{(k - m\omega^2)^2 + h^2\omega^2}}{(k - m\omega^2)^2 + h^2\omega^2} = \frac{A}{\sqrt{(k - m\omega^2)^2 + h^2\omega^2}} \approx \frac{A}{|k - m\omega^2|} = \frac{A}{m\,|\omega_0^2 - \omega^2|} = \frac{A}{m\,|\omega_0 - \omega|\,(\omega_0 + \omega)} \approx \frac{A}{2m\omega_0\,|\omega - \omega_0|}$$

Comparing this with formula (58), we see that the amplitude of the constrained oscillations is equal to half the amplitude of the beats. The oscillation curve thus has the form shown in Fig. 86. The time interval during which the beats turn into purely harmonic oscillations is the transient period, and the process itself is called a transient process.
If the oscillator is without friction, then in the special case of $\omega = \omega_0$, that is, when the frequency of the driving force coincides with the natural frequency, formula (56) is inapplicable. Recall that this is precisely the case that we skipped at the end of Sec. 5.5. Let us take advantage of the general formula (54) and assume for the sake of simplicity $t_0 = 0$:

$$\begin{aligned}
y &= \frac{1}{m\omega_0}\int_0^t \sin\omega_0(t - \tau)\cdot A\sin\omega_0\tau\,d\tau \\
&= \frac{A}{2m\omega_0}\int_0^t \left[\cos\omega_0(t - 2\tau) - \cos\omega_0 t\right]d\tau \\
&= \frac{A}{2m\omega_0}\left[\frac{\sin\omega_0(t - 2\tau)}{-2\omega_0} - \tau\cos\omega_0 t\right]_{\tau=0}^{t} \\
&= \frac{A}{2m\omega_0^2}\sin\omega_0 t - \frac{A}{2m\omega_0}\,t\cos\omega_0 t
\end{aligned}$$

The first of the terms is a natural harmonic oscillation and is present solely to satisfy the zero initial conditions. In contrast, the second term is an oscillation whose amplitude tends to infinity linearly with time. Therein lies the very important phenomenon of resonance, which results when the frequency of the driving force coincides with the natural frequency of the system.
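The resonance solution can be checked against the equation it is supposed to satisfy. The sketch below substitutes it into $m\,y'' + ky = A\sin\omega_0 t$ with $k = m\omega_0^2$, estimating the second derivative by a central difference; the parameter values, the function names and the finite-difference step are assumptions chosen for illustration.

```python
import math


def resonance(t, A, omega0, m):
    """Solution for omega = omega0 with zero initial data at t = 0."""
    return ((A / (2 * m * omega0**2)) * math.sin(omega0 * t)
            - (A / (2 * m * omega0)) * t * math.cos(omega0 * t))


def residual(t, A, omega0, m, d=1e-4):
    """m*y'' + k*y - A*sin(omega0*t), with y'' from a central difference of step d."""
    k = m * omega0**2
    y = lambda s: resonance(s, A, omega0, m)
    y2 = (y(t + d) - 2 * y(t) + y(t - d)) / d**2
    return m * y2 + k * y(t) - A * math.sin(omega0 * t)


if __name__ == "__main__":
    A, omega0, m = 1.0, 3.0, 1.0                    # hypothetical values
    for t in (0.5, 2.0, 5.0):
        # The residual should be small, limited only by the difference step.
        print(t, resonance(t, A, omega0, m), residual(t, A, omega0, m))
```

The growing term has envelope $\dfrac{A}{2m\omega_0}\,t$, so doubling the elapsed time roughly doubles the swing once $t$ is large compared with $1/\omega_0$.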

Exercises 

Find  solutions  of  the  following  differential  equations  that  satisfy 
the  indicated  initial  data. 

1. $y'' - y = 1$, $y = 0$, $y' = 0$ for $t = 0$.

2. $y'' + y = t$, $y = 1$, $y' = 0$ for $t = 0$.
7.6 Stable and unstable solutions

We begin with the simplest equation

$$\frac{dy}{dt} = ay \quad (a = \text{constant}) \tag{59}$$

with the variable $t$ interpreted as the time. It is easy to obtain the general solution:

$$y = Ce^{at} \tag{60}$$

where $C$ is an arbitrary constant determined from the initial condition

$$y(t_0) = y_0$$

Substituting into (60), we get

$$y_0 = Ce^{at_0}, \quad\text{or}\quad C = y_0e^{-at_0}$$

and finally

$$y = y_0e^{a(t - t_0)} \tag{61}$$

In particular, when $y_0 = 0$ we get the zero solution $y = 0$.
Now  suppose  that  the  initial  value  y0,  which  we  regarded  as  zero, 
is  actually  different  from  zero,  albeit  just  slightly.  Then  how  will 
the  solution  behave  with  time,  that  is,  as  t increases?  Will  such  a 
perturbed  solution  approach  the  unperturbed  zero  solution  or  will 
it  recede  from  it? 

The answer to these questions depends essentially on the sign of the coefficient $a$. If $a < 0$, then (61) shows immediately that as $t$ increases the solutions approach zero without limit, so that for large $t$ they practically vanish. In such a situation, the unperturbed solution is said to be asymptotically stable relative to the change (perturbation) of the initial condition or asymptotically stable in the sense of Lyapunov (after the celebrated Russian mathematician A. M. Lyapunov, 1857-1918), who was the first to begin a systematic study of the concept of stability of processes.

The picture will be quite different for $a > 0$. Here, when $y_0 \neq 0$ and $t$ is increasing, the solution increases in absolute value without bound, that is to say, it becomes significant even if $y_0$ was arbitrarily small. Here the unperturbed solution is said to be unstable. Equation (59) with $a > 0$ arises, for example, when considering the growth of bacteria in a nutrient medium, with $y$ denoting the mass of bacteria per unit volume and $a$ the intensity of growth. It is clear that if at the initial time there were no bacteria at all, then of course none will appear in the course of time. But this picture is unstable in the sense that a purposeful or accidental introduction of any arbitrarily small amount of bacteria into the medium will, in time, lead to extreme growth and pollution of the medium with bacteria.
Fig. 87


The intermediate case $a = 0$ is interesting. Here the solutions will merely be constant and for this reason, for a small initial deviation of the perturbed solution from the unperturbed solution, the former will be close to the latter even when $t$ increases, although the approach will not be asymptotic (as $t \to \infty$). Such a picture is called nonasymptotic stability of an unperturbed solution.

Now let us consider the more general equation

$$\frac{dy}{dt} = f(y) \tag{62}$$

It is easy to find all the stationary solutions, that is, solutions of the form $y$ = constant. To do this, in (62) set $y = \bar{y}$ = constant to get

$$f(\bar{y}) = 0 \tag{63}$$

Thus the stationary solutions of (62) are the zeros of the function $f(y)$ in the right-hand member. Let us examine one such solution $y = \bar{y}$ and determine whether it is stable or not.

First assume that the function $f(y)$ is decreasing, at least in a certain neighbourhood of the value $y = \bar{y}$; then if $y$ passes through the value $\bar{y}$ it follows that $f(y)$ passes from positive values to negative values. In this case the approximate picture of the direction field defined by equation (62) (cf. Sec. 7.1) is shown in Fig. 87. In constructing this field, bear in mind that the isoclines for (62) have the form $y$ = constant (why?), that is, they are straight lines parallel to the $t$-axis (they are also shown in Fig. 87). In Fig. 87 the heavy lines indicate the integral straight line $y = \bar{y}$, which depicts an unperturbed stationary solution, and several other integral curves that result from changes in the initial condition. It is clear that if $y_0$ differs but slightly from $\bar{y}$ (say, within the limits of the drawing), then the perturbed solution does not differ appreciably from the unperturbed one even when $t$ increases, and when $t \to \infty$ it asymptotically approaches the unperturbed solution. Thus, in this case the unperturbed solution is asymptotically stable.

Fig. 88

Now let $f(y)$ increase from negative values to positive ones when $y$ passes through the value $\bar{y}$. The appropriate picture is given in Fig. 88. It is clear that no matter how close $y_0$ is to $\bar{y}$ (but not equal to $\bar{y}$!), the appropriate solution $y(t)$ will recede from the unperturbed solution to a finite but substantial distance as $t$ increases. This means that in the case at hand the unperturbed stationary solution is unstable. (Check to see that the earlier obtained criteria for stability and instability for equation (59) can be obtained as a consequence of general criteria indicated for equation (62).)

The criteria obtained here can be derived differently, without resorting to a geometric picture. Expand the right side of (62) in a power series about the value $y = \bar{y}$; then by virtue of the condition (63) there will not be a constant term in the expansion and we get

$$\frac{dy}{dt} = f'(\bar{y})(y - \bar{y}) + \ldots$$

That is,

$$\frac{d(y - \bar{y})}{dt} = f'(\bar{y})(y - \bar{y}) + \ldots \tag{64}$$

where the dots stand for higher-order terms. It must be stressed that when determining stability in the sense of Lyapunov, we study the behaviour of perturbed solutions that differ but slightly from the unperturbed solution, that is to say, we consider only small values of $y - \bar{y}$. For this reason, the main role in the right-hand member of (64) is played by the linear term that is written out. Discarding higher-order terms, we obtain an equation of the form (59) in which $a = f'(\bar{y})$. Applying the results obtained above for equation (59), we see that for $f'(\bar{y}) < 0$ the solution $y - \bar{y} = 0$, or $y = \bar{y}$, is asymptotically stable; but if $f'(\bar{y}) > 0$, then the solution $y = \bar{y}$ is unstable. But in the first case the function $f(y)$ decreases (at least about the value $y = \bar{y}$), while in the latter case it increases, so that we arrive at the same conclusions that were obtained via geometrical reasoning. In a particular case where $f'(\bar{y}) = a = 0$, for equation (59) we have nonasymptotic stability, that is, the solutions close to the unperturbed solution do not tend to it and do not recede from it; then in the complete equation (64) a substantial role is played by higher-order terms, which in one instance can direct the perturbed solutions to the unperturbed solution, while in another can carry them away to a substantial distance. We will not discuss this particular case.
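The linearization criterion translates directly into a few lines of code. The sketch below estimates $f'$ at a stationary point by a central difference and reads off the verdict from its sign; the function names, the tolerance, and the sample right-hand side $f(y) = y(1 - y)$ are assumptions made here for illustration and do not come from the text.

```python
def classify_stationary(f, y_bar, eps=1e-6, tol=1e-8):
    """Classify the stationary solution y = y_bar of dy/dt = f(y) by the sign of f'(y_bar)."""
    slope = (f(y_bar + eps) - f(y_bar - eps)) / (2 * eps)   # central-difference estimate of f'
    if slope < -tol:
        return "asymptotically stable"
    if slope > tol:
        return "unstable"
    # f' is (numerically) zero: higher-order terms decide, as noted in the text.
    return "undecided by the linear term"


def logistic(y):
    """Hypothetical right-hand side with stationary points y = 0 and y = 1."""
    return y * (1.0 - y)


if __name__ == "__main__":
    for y_bar in (0.0, 1.0):
        print(y_bar, classify_stationary(logistic, y_bar))
```

The third return value corresponds to the case $f'(\bar{y}) = 0$ set aside above; the routine deliberately refuses to decide there instead of guessing.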

To illustrate, let us examine the thermal regime in a certain volume where we have a chemical reaction associated with release of heat, which is carried off into the ambient space. Since the rate of the reaction depends on the temperature $T$ in the given volume (we consider the mean temperature at a given time $t$), the rate $Q$ of heat release in the reaction depends on $T$: $Q = Q(T)$. We assume the relationship to be as shown in Fig. 89. We also suppose the rate of heat dissipation into the ambient space to be equal to $a(T - T_0)$, where $a$ is the proportionality constant and $T_0$ is the temperature of the ambient medium. Then, for a constant heat capacity $c$ of the volume at hand, the differential equation of the process becomes

$$c\frac{dT}{dt} = Q(T) - a(T - T_0) \tag{65}$$

By virtue of the foregoing, a stationary state (in which the temperature remains constant during the reaction) is possible for those $T$ for which the right side vanishes, which means that the graph of $Q(T)$ crosses the graph of $a(T - T_0)$ (see Fig. 89). We see that if the ambient temperature $T_0$ is sufficiently great (the higher of the two values of $T_0$ marked in Fig. 89), a stationary state is impossible: the supply of heat will all the time be greater than dissipation and the volume will constantly heat up. If the ambient temperature is low (the lower value of $T_0$ in Fig. 89), two stationary states having temperatures $T_1$ and $T_2$ are conceivable. Near the value $T_1$ the right member of (65) passes from positive values to negative values, which means it decreases. We have already seen that such a state is a stable state. This is evident from Fig. 89, for if the temperature $T$ falls below $T_1$, then more heat will be released in the reaction than is dissipated, which means the volume will heat up, and if $T$ rises above $T_1$, then more heat will be dissipated than released and the volume will cool off. In similar fashion we can verify that the stationary temperature $T_2$ will be unstable. Thus, at this lower ambient temperature the development of the process depends on the initial temperature as follows: if it was less than $T_2$, then in time the temperature tends to the stationary value $T_1$; if the initial temperature was greater than $T_2$, then the temperature builds up catastrophically. Such was the reasoning that served as the basis for the theory of thermal explosion developed by Nobel Laureate N. N. Semenov in 1927.
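A toy computation makes the two regimes concrete. The sketch below is purely illustrative: the heat-release law $Q(T) = e^{T}$ in dimensionless units, the constant $a = 1$, the two ambient temperatures $T_0 = -2$ and $T_0 = 0$, and all function names are hypothetical choices made here, since the text specifies $Q(T)$ only through Fig. 89. Stationary states are found by bisection and classified by the sign of the slope of the right side of (65).

```python
import math


def net_heating(T, a, T0):
    """Right side of (65) for the hypothetical release law Q(T) = exp(T)."""
    return math.exp(T) - a * (T - T0)


def bisect(g, lo, hi, tol=1e-12):
    """Root of g on [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_lo > 0) == (g_mid > 0):   # same sign as at lo: the root lies in [mid, hi]
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def slope(g, T, eps=1e-6):
    """Central-difference estimate of dg/dT."""
    return (g(T + eps) - g(T - eps)) / (2 * eps)


if __name__ == "__main__":
    a, T0 = 1.0, -2.0                   # low ambient temperature: two stationary states
    g = lambda T: net_heating(T, a, T0)
    T1, T2 = bisect(g, -2.0, 0.0), bisect(g, 0.0, 2.0)
    print(T1, slope(g, T1))             # negative slope: stable state
    print(T2, slope(g, T2))             # positive slope: unstable state
```

With the higher ambient temperature $T_0 = 0$ the same right side, $e^{T} - T$, is positive for every $T$, so no stationary state exists and the model volume heats up without limit, in line with the qualitative picture described above.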
Now let us take up the equation of free oscillations:

$$m\frac{d^2y}{dt^2} + h\frac{dy}{dt} + ky = 0 \quad (m, h, k > 0) \tag{21}$$

with the general solution

$$y = C_1e^{p_1t} + C_2e^{p_2t} \tag{66}$$

where $p_1$ and $p_2$ are roots of the characteristic equation (24) and $C_1$ and $C_2$ are arbitrary constants determined from the initial conditions. This equation has the stationary solution $y = 0$. In Sec. 7.3 we saw that all other solutions tend to zero (in an oscillatory or nonoscillatory manner) as $t$ increases; thus the indicated stationary solution is asymptotically stable. In the absence of friction, that is when $h = 0$, we saw that the solutions are periodic. For this reason, the solution will be small even when $t$ increases, given a small initial deviation and a small initial velocity, but it will not tend to zero. Hence in the absence of friction the stationary solution will be stable but not asymptotically stable.

Using specially chosen schemes, one can construct systems with one degree of freedom described by equation (21) (where $y$ is the deviation of the system from the stationary state) for which $h < 0$ or $k < 0$. Such systems may be interpreted as systems with negative friction or with negative elasticity. (See, for example, the description of the operation of a tunnel diode given in HM, Sec. 8.16; in this mode of operation, the system may be interpreted as an oscillator with negative friction.) It can be readily verified that in all such systems the stationary solution $y = 0$ is unstable. Indeed, from algebra we know the properties of the roots $p_1$ and $p_2$ of the quadratic equation (24):

$$p_1 + p_2 = -\frac{h}{m}, \qquad p_1p_2 = \frac{k}{m}$$

From the first equation it is seen that if $h < 0$, then either at least one root is positive and real or the roots are conjugate imaginaries with positive real part. From the second equation we see that if $k < 0$, then the roots are of different sign and for this reason there is one positive root. Thus in all cases there is at least one root that is positive and real or imaginary with a positive real part. Let $p_1$ be that root. Then the first term on the right of (66) is of the form

$$C_1e^{p_1t}\ (p_1 > 0) \quad\text{or}\quad C_1e^{(\gamma + i\omega)t} = C_1e^{\gamma t}(\cos\omega t + i\sin\omega t)\ (\gamma > 0)$$

and therefore, for arbitrarily small $C_1$ (which is to say, for arbitrarily small initial data), it tends to infinity as $t$ increases. This is what signifies the instability of a stationary solution.
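The root test described here is straightforward to automate. The sketch below assumes, in line with the sum and product of the roots quoted above, that the characteristic equation (24) reads $mp^2 + hp + k = 0$; the function names and the sample coefficients are illustrative assumptions.

```python
import cmath


def char_roots(m, h, k):
    """Roots of m*p**2 + h*p + k = 0 (assumed form of the characteristic equation)."""
    d = cmath.sqrt(h * h - 4 * m * k)
    return (-h + d) / (2 * m), (-h - d) / (2 * m)


def zero_solution_is_unstable(m, h, k):
    """True if some root has a positive real part, so that some solution grows without bound."""
    return max(p.real for p in char_roots(m, h, k)) > 0


if __name__ == "__main__":
    cases = {
        "ordinary friction": (1.0, 0.5, 4.0),
        "negative friction": (1.0, -0.2, 4.0),
        "negative elasticity": (1.0, 1.0, -2.0),
    }
    for name, (m, h, k) in cases.items():
        print(name, char_roots(m, h, k), zero_solution_is_unstable(m, h, k))
```

The case $h = 0$, $k > 0$ gives purely imaginary roots and is reported as not unstable, which matches the stable but not asymptotically stable behaviour described earlier for the frictionless oscillator.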

Exercise 

Find the stationary solutions of the equation $\dfrac{dy}{dt} = y^3 - 4y$ and determine their stability.

ANSWERS AND SOLUTIONS

Sec. 7.1

1. The circles $x^2 + y^2 = C$ with centre at the coordinate origin.

2. At points of inflection it must be true that $y'' = 0$. By virtue of equation (1) and the formula for the derivative of a composite function (Sec. 4.1), we get $y'' = (f(x, y))' = f'_x(x, y) + f'_y(x, y)\,y' = f'_x + f'_y f$. Hence the desired equation is of the form $f'_x + f'_y f = 0$. For the equation $\dfrac{dy}{dx} = x^2 + y^2$ we get $x + y(x^2 + y^2) = 0$.

Sec. 7.2

1. $y =$ [...]

2. $y =$ [...]

3. $y = \ln(x + e)$.

4. $y = \sqrt{1 + x^2}$.

6. $y = 2x - 1 - e^{-2x}$.

7. $y = e^{-x} +$ [...] $(\sin x + \cos x)$.

Sec. 7.3

1. $y = 2\cos t$.

2. $y =$ [...] $\sin 2t$.

3. $y = e^{2t} + e^{t}$.

4. $y =$ [...] $+\ 3e^{-t}$.

5. $y = (t - 1)e^{t}$.

6. $y = (5t + 1)e^{-2t}$.

Sec. 7.4

1. $x = -3(t - 2) + 1 = -3t + 7$.

2. $x = \dfrac{t^2}{2} - 2$.

3. $x = 2t - \sin t$.

4. $x = \displaystyle\int_{-\infty}^{t} (t - \tau)e^{\tau}\,d\tau = e^{t}$.

Sec. 7.5

1. $y = \dfrac{1}{2}(e^{t} + e^{-t}) - 1$.

2. $y = t + \cos t - \sin t$.

Sec. 7.6

$y = \pm 2$ (unstable), $y = 0$ (stable).

Chapter  8 

DIFFERENTIAL  EQUATIONS  CONTINUED 


This  chapter  is  a direct  continuation  of  the 
preceding  one,  which  serves  as  an  essential 
foundation. 

8.1  Singular  points 

In Sec. 7.1 we established that the integral curves of the equation $\dfrac{dy}{dx} = f(x, y)$ do not intersect. However, there is an important exception to this rule. It may indeed happen that for certain values of $x$ and $y$ the function $f(x, y)$ does not have a definite value. For example, $f(x, y) = \dfrac{y}{x}$ does not have a definite value for $x = 0$, $y = 0$. The point in the plane at which $f(x, y)$ becomes meaningless is called a singular point of the differential equation $\dfrac{dy}{dx} = f(x, y)$. Several integral curves can pass through a singular point.

If $f(x, y)$ has the form of a ratio of two functions of a simple type (say, two polynomials), $f(x, y) = \dfrac{\varphi(x, y)}{\psi(x, y)}$, then the coordinates of the singular point are found from the system of equations

$$\varphi(x, y) = 0, \qquad \psi(x, y) = 0$$

Let  us  examine  several  examples  of  singular  points  that  belong  to 
the  types  most  frequently  encountered  in  applications. 

1. $\dfrac{dy}{dx} = \dfrac{y}{x}$. The solution of this equation is the function $y = Cx$ for any constant $C$. The collection of integral curves are all straight lines passing through the origin (Fig. 90). Thus the integral curves intersect at the singular point $x = 0$, $y = 0$.

2. $\dfrac{dy}{dx} = \dfrac{2y}{x}$. The solution is $y = Cx^2$. The integral curves are parabolas with vertex at the origin. Here, all the integral curves are tangent at the singular point $x = 0$, $y = 0$ (Fig. 91).
Fig. 90


Fig. 91

In  the  foregoing  examples,  all  the  integral  curves  pass  through 
the  singular  point  and  have  a definite  direction  there.  Such  a singular 
point  is  called  a nodal  point. 

3. There are also singular points near which integral curves behave differently. Let $\dfrac{dy}{dx} = -\dfrac{y}{x}$. The integral curves have the equation $xy = C$. For $C = 0$ we get $x = 0$ or $y = 0$, which are two straight lines passing through the origin. For $C \neq 0$, the integral curves are hyperbolas. Thus, two integral curves pass through the singular point $x = 0$, $y = 0$, and the others do not pass through it. A singular point of this type is called a saddle point (Fig. 92).

4. The integral curves of the equation $\dfrac{dy}{dx} = -\dfrac{x}{y}$ are the circles $x^2 + y^2 = C$. Here the integral curves enclose the singular point but not a single integral curve passes through it (Fig. 93). A singular point of this type is called a vortex point.

5. When integrating the equation $\dfrac{dy}{dx} = \dfrac{x + y}{x - y}$ with singular point at the origin, it is convenient to pass to polar coordinates $\rho$, $\varphi$ via the formulas $x = \rho\cos\varphi$, $y = \rho\sin\varphi$, whence

$$dx = \cos\varphi\,d\rho - \rho\sin\varphi\,d\varphi, \qquad dy = \sin\varphi\,d\rho + \rho\cos\varphi\,d\varphi$$

After substituting into the equation and collecting terms, we get $d\rho = \rho\,d\varphi$ (verify this), whence $\dfrac{d\rho}{\rho} = d\varphi$ and $\rho = Ce^{\varphi}$. Assigning all possible values to $C$, we get a family of spirals closing in on the origin (Fig. 94). A singular point of this type is called a focal point.
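For readers who want to see the verification requested in example 5 written out, one possible route is sketched below. It uses only the polar substitution $x = \rho\cos\varphi$, $y = \rho\sin\varphi$ from that example; cancelling the common factor $\rho$ in the right-hand side assumes $\rho \neq 0$, that is, a point other than the singular point itself.

```latex
\begin{aligned}
\frac{dy}{dx}
  &= \frac{\sin\varphi\,d\rho + \rho\cos\varphi\,d\varphi}
          {\cos\varphi\,d\rho - \rho\sin\varphi\,d\varphi}
   = \frac{x + y}{x - y}
   = \frac{\cos\varphi + \sin\varphi}{\cos\varphi - \sin\varphi}, \\
(\sin\varphi\,d\rho + \rho\cos\varphi\,d\varphi)(\cos\varphi - \sin\varphi)
  &= (\cos\varphi\,d\rho - \rho\sin\varphi\,d\varphi)(\cos\varphi + \sin\varphi), \\
-(\sin^2\varphi + \cos^2\varphi)\,d\rho + \rho\,(\cos^2\varphi + \sin^2\varphi)\,d\varphi &= 0, \\
d\rho &= \rho\,d\varphi .
\end{aligned}
```

In the third line the mixed terms $\sin\varphi\cos\varphi\,d\rho$ and $\rho\sin\varphi\cos\varphi\,d\varphi$ appear on both sides of the second line and cancel, which is what leaves such a simple equation for $\rho(\varphi)$.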

In  the  foregoing  examples  it  is  easy  to  establish  the  type  of 
behaviour  of  integral  curves  near  the  singular  point,  since  it  is  easy 
to  solve  the  differential  equations.  In  more  complicated  cases  one 
can  get  a rough  idea  of  the  nature  of  the  singular  point  by  drawing 
isoclines.  More  effective  procedures  for  investigating  singular  points 
lie  outside  the  scope  of  this  book.  Application  of  these  procedures 
shows, in particular, that if the condition $\varphi'_x\psi'_y \neq \varphi'_y\psi'_x$ holds, the
singular  point  must  definitely  belong  to  one  of  the  foregoing  types. 

Exercise 

Determine the nature of the singular point for the equations

$$\frac{dy}{dx} = \frac{x}{y}, \qquad \frac{dy}{dx} = \frac{x + 2y}{x}, \qquad \frac{dy}{dx} = \frac{2x + y}{x + 2y}$$

8.2  Systems  of  differential  equations 

So  far  we  have  assumed  that  there  is  one  differential  equation 
from  which  it  is  necessary  to  find  one  desired  function.  But  there 
may be more than one unknown function, say two: $y(x)$ and $z(x)$. Then there must be two differential equations as well. If these are of the first order (solved for the derivatives of the desired functions), then they have the general form

$$\frac{dy}{dx} = f(x, y, z), \qquad \frac{dz}{dx} = \varphi(x, y, z) \tag{1}$$

We thus obtain a system of first-order differential equations.

[Fig. 92]

[Fig. 93]

It is easy to pass from one equation of the second order,

$$\frac{d^2y}{dx^2} = F\left(x, y, \frac{dy}{dx}\right) \tag{2}$$

in one unknown function to an equivalent system of two first-order equations with two unknown functions. For this we have to regard $\frac{dy}{dx}$ as the additional unknown function. Denoting it by $z$, we get

$$\frac{dy}{dx} = z, \qquad \frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{dz}{dx}$$

And so instead of (2) we can write the equivalent system

$$\frac{dy}{dx} = z, \qquad \frac{dz}{dx} = F(x, y, z)$$
Similarly, an equation of the third order,

$$\frac{d^3y}{dx^3} = \Phi\left(x, y, \frac{dy}{dx}, \frac{d^2y}{dx^2}\right)$$

can be replaced by an equivalent system of three equations of the first order. To do this, set

$$\frac{dy}{dx} = z, \qquad \frac{d^2y}{dx^2} = \frac{dz}{dx} = u$$

to get the system

$$\frac{dy}{dx} = z, \qquad \frac{dz}{dx} = u, \qquad \frac{du}{dx} = \Phi(x, y, z, u)$$

In similar fashion it is possible to go from an equation of any order or even a system of equations of any order to a system of the first order. Conversely, it can be verified that it is possible to pass from a system of $n$ equations of the first order with $n$ unknown functions to one equation of order $n$ with one unknown function. For this reason, the general solution of such a system is obtained as the result of $n$ integrations and, thus, contains $n$ arbitrary constants. The specification of $n$ initial conditions (for some value of $x$ we specify the values of all desired functions) is just sufficient to find the values of these constants and, thus, to obtain a particular solution.
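The reduction described above can be sketched numerically. The following is a minimal illustration, not taken from the text: the second-order equation $y'' = -y$ is rewritten as the system $dy/dx = z$, $dz/dx = -y$ and integrated with the classical fourth-order Runge-Kutta scheme. The chosen equation, the step count and the helper names `rk4_system` and `rhs` are assumptions made for the example; the two initial conditions $y(0) = 0$, $z(0) = 1$ play the role of the two arbitrary constants.

```python
import math

def rk4_system(rhs, x0, state0, x_end, steps):
    """Integrate d(state)/dx = rhs(x, state) from x0 to x_end with classical RK4."""
    h = (x_end - x0) / steps
    x, state = x0, list(state0)
    for _ in range(steps):
        k1 = rhs(x, state)
        k2 = rhs(x + h / 2, [s + h / 2 * k for s, k in zip(state, k1)])
        k3 = rhs(x + h / 2, [s + h / 2 * k for s, k in zip(state, k2)])
        k4 = rhs(x + h, [s + h * k for s, k in zip(state, k3)])
        state = [s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
        x += h
    return state

def rhs(x, state):
    # y'' = F(x, y, y') with F = -y, written as dy/dx = z, dz/dx = F(x, y, z)
    y, z = state
    return [z, -y]

# two initial conditions for a system of two first-order equations
y1, z1 = rk4_system(rhs, 0.0, [0.0, 1.0], 1.0, 100)
print(y1, math.sin(1.0))   # the exact solution of this example is y = sin x
```

Any other right-hand side $F(x, y, z)$ can be tried by changing the last line of `rhs`.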

For the sake of simplicity, let us consider a system of two first-order equations of type (1). It is sometimes possible, even without solving the system, to find a relation connecting the components (that is, $y(x)$ and $z(x)$) of any particular solution. Such a relation is of the form

$$H(x, y, z; C) = 0 \tag{3}$$

(here, $C$ is a constant that varies from solution to solution), and is called the first integral of the system of equations. A knowledge of the first integral makes it possible to lower the number of equations in the system by one, that is, to pass to one equation of the first order with one unknown function. We can use (3) to express $z$ in terms of the rest and substitute the result into the first equation of (1) to get one first-order equation with one desired function $y(x)$. If this equation is integrated and we find $y(x)$, then $z(x)$ may be found from (3) without integrations.
Similarly, a knowledge of two independent first integrals (that is, such that neither is a consequence of the other),

$$H_1(x, y, z; C_1) = 0, \qquad H_2(x, y, z; C_2) = 0$$

produces the general solution of system (1) written in implicit form. For a system of $n$ first-order equations the general solution is obtained from $n$ independent first integrals.

In certain cases the first integrals are suggested by physical reasoning, most often by conservation laws. For example, let us write down the equation of one-dimensional elastic linear oscillations without friction (see Sec. 7.3),

$$m\frac{d^2x}{dt^2} + kx = 0$$

in the form of a system of the first order:

$$\frac{dx}{dt} = v, \qquad m\frac{dv}{dt} = -kx \tag{4}$$

In Sec. 7.3 we already mentioned the expression

$$E = \frac{mv^2}{2} + \frac{kx^2}{2} \tag{5}$$

for the total energy of an oscillating point. The energy should be conserved in the case of free oscillations without friction. Sure enough, by virtue of (4),

$$\frac{dE}{dt} = mv\frac{dv}{dt} + kx\frac{dx}{dt} = -kxv + kxv = 0$$

(this is a mathematical demonstration of the law of conservation of energy in the given instance). Thus, $E$ = constant for any solution of system (4), which means that formula (5), in which $E$ plays the part of the arbitrary constant $C$, serves as the first integral of the system (4).
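As a rough numerical check of this first integral (a sketch under assumed values, not part of the original text), one can integrate system (4) step by step and watch the energy (5) along the computed solution. The values of `m`, `k`, the step size and the helper names `step_rk4` and `energy` are illustrative assumptions.

```python
def step_rk4(x, v, h, m, k):
    """One classical RK4 step for system (4): dx/dt = v, m dv/dt = -k x."""
    def f(x_, v_):
        return v_, -k * x_ / m
    k1x, k1v = f(x, v)
    k2x, k2v = f(x + h / 2 * k1x, v + h / 2 * k1v)
    k3x, k3v = f(x + h / 2 * k2x, v + h / 2 * k2v)
    k4x, k4v = f(x + h * k3x, v + h * k3v)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            v + h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v))

def energy(x, v, m, k):
    # formula (5): E = m v^2 / 2 + k x^2 / 2
    return m * v * v / 2 + k * x * x / 2

m, k = 2.0, 8.0          # illustrative mass and stiffness
x, v = 1.0, 0.0          # initial displacement and velocity
e0 = energy(x, v, m, k)
max_drift = 0.0
for _ in range(10000):   # integrate up to t = 10 with step 0.001
    x, v = step_rk4(x, v, 0.001, m, k)
    max_drift = max(max_drift, abs(energy(x, v, m, k) - e0))
print(e0, max_drift)     # the drift reflects only the numerical error
```

The energy stays constant to within the accuracy of the integration scheme, in agreement with $dE/dt = 0$.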

As is evident from the foregoing, it is most natural to consider systems in which the number of equations is equal to the number of unknown functions; such systems are called closed systems. If there are fewer equations than desired functions, the system is said to be open (underdetermined); in most cases the openness of the system indicates merely that not all the necessary relations have been written down. If there are more equations than unknown functions, then the system is said to be overdetermined; overdetermination of a system ordinarily indicates that it is dependent (i.e. that some of the equations are consequences of others and therefore superfluous) or that a mistake has been made in setting up the system.
Exercise

Consider the system of equations

$$\frac{dy}{dx} = y + z, \qquad \frac{dz}{dx} = -y + z$$

Multiply the first equation by $y$ and the second by $z$ and then add the results to find the first integral of the system. What conclusions can be drawn from this concerning the behaviour of particular solutions as $x \to \pm\infty$?

8.3  Determinants  and  the  solution  of  linear  systems  with  constant 
coefficients 

First  let  us  examine  the  notion  of  a determinant,  for  it  will  play 
an  important  role  in  the  solution  and  investigation  of  systems  of 
linear  equations  of  various  kinds.  We  begin  with  a system  of  two 
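algebraic equations of the first degree in two unknowns:

$$\begin{aligned} a_1 x + b_1 y &= d_1, \\ a_2 x + b_2 y &= d_2 \end{aligned} \tag{6}$$

Solving it (we leave this to the reader), we get

$$x = \frac{d_1 b_2 - b_1 d_2}{a_1 b_2 - b_1 a_2}, \qquad y = \frac{a_1 d_2 - d_1 a_2}{a_1 b_2 - b_1 a_2} \tag{7}$$

The expression $a_1 b_2 - b_1 a_2$ is called a determinant of the second order and is written as

$$a_1 b_2 - b_1 a_2 = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} \tag{8}$$

where the vertical lines are the sign of the determinant. Using this notation, we can write (7) as

$$x = \frac{\begin{vmatrix} d_1 & b_1 \\ d_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}, \qquad y = \frac{\begin{vmatrix} a_1 & d_1 \\ a_2 & d_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}} \tag{9}$$

To illustrate, let us evaluate a determinant:

$$\begin{vmatrix} 0 & -3 \\ 2 & 1 \end{vmatrix} = 0 \cdot 1 - (-3) \cdot 2 = 0 + 6 = 6$$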

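A similar solution of the system of equations

$$\begin{aligned} a_1 x + b_1 y + c_1 z &= d_1, \\ a_2 x + b_2 y + c_2 z &= d_2, \\ a_3 x + b_3 y + c_3 z &= d_3 \end{aligned} \tag{10}$$

gives

$$x = \frac{\begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}, \qquad y = \frac{\begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}, \qquad z = \frac{\begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}} \tag{11}$$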


where we have third-order determinants in the numerators and denominators; they are computed from the formula

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1 b_2 c_3 - a_1 c_2 b_3 - b_1 a_2 c_3 + b_1 c_2 a_3 + c_1 a_2 b_3 - c_1 b_2 a_3 \tag{12}$$


Formulas  (11)  are  completely  analogous  to  formulas  (9).  In  the 
denominator  we  have  one  and  the  same  determinant  made  up  of 
the  coefficients  of  the  unknowns  (the  so-called  determinant  of  the 
system). In the numerator, for each of the unknowns we have a determinant obtained from the determinant of the system by substituting the column of constant terms for the column of coefficients of the given unknown.

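A determinant of the third order can be expressed in terms of determinants of the second order. To do this, transform the expression (12) and take advantage of (8):

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1(b_2 c_3 - c_2 b_3) - b_1(a_2 c_3 - c_2 a_3) + c_1(a_2 b_3 - b_2 a_3) = a_1 \begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix} - b_1 \begin{vmatrix} a_2 & c_2 \\ a_3 & c_3 \end{vmatrix} + c_1 \begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix} \tag{13}$$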


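This formula can be used to compute the value of the determinant. For example,

$$\begin{vmatrix} 1 & 0 & -2 \\ -1 & 1 & \frac{1}{2} \\ 3 & 1 & 2 \end{vmatrix} = 1 \begin{vmatrix} 1 & \frac{1}{2} \\ 1 & 2 \end{vmatrix} - 0 \begin{vmatrix} -1 & \frac{1}{2} \\ 3 & 2 \end{vmatrix} + (-2) \begin{vmatrix} -1 & 1 \\ 3 & 1 \end{vmatrix} = 1\left(1 \cdot 2 - \frac{1}{2} \cdot 1\right) - 2(-1 \cdot 1 - 1 \cdot 3) = 1\frac{1}{2} + 8 = 9\frac{1}{2}$$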
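The expansion along the first row and the rule for building the numerators can be sketched in a few lines. This is an illustrative sketch only: the function names `det` and `cramer` and the small 2 by 2 test system are assumptions introduced here, and exact fractions are used so that the arithmetic can be compared with a hand calculation.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by expansion along the first row, in the manner of formula (13)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for j in range(n):
        # minor: delete the first row and column j
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * matrix[0][j] * det(minor)
    return total

def cramer(coeffs, rhs):
    """Solve a square linear system by determinant formulas of the type (9), (11)."""
    d = det(coeffs)
    if d == 0:
        raise ValueError("determinant of the system is zero")
    solution = []
    for j in range(len(coeffs)):
        # replace column j by the column of constant terms
        replaced = [row[:j] + [r] + row[j + 1:] for row, r in zip(coeffs, rhs)]
        solution.append(det(replaced) / d)
    return solution

half = Fraction(1, 2)
example = [[1, 0, -2], [-1, 1, half], [3, 1, 2]]
print(det(example))      # third-order determinant with rows (1, 0, -2), (-1, 1, 1/2), (3, 1, 2)

# hypothetical system: 2x + y = 5, x + 3y = 10
xy = cramer([[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]],
            [Fraction(5), Fraction(10)])
print(xy)
```

The recursion in `det` mirrors the remark that determinants of each order are computed in terms of determinants of the next lower order; for large systems this is far slower than elimination and is shown only to illustrate the definition.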
It turns out that formulas like (9) and (11) hold true for systems consisting of any number of equations of the first degree if the number of unknowns is equal to the number of equations. Determinants of the fourth order are computed by analogy with (13):

$$\begin{vmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \\ a_4 & b_4 & c_4 & d_4 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix} - b_1 \begin{vmatrix} a_2 & c_2 & d_2 \\ a_3 & c_3 & d_3 \\ a_4 & c_4 & d_4 \end{vmatrix} + c_1 \begin{vmatrix} a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \\ a_4 & b_4 & d_4 \end{vmatrix} - d_1 \begin{vmatrix} a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \\ a_4 & b_4 & c_4 \end{vmatrix}$$

Pay special attention to the structure of the expression on the right. Determinants of order five are determined in terms of determinants of the fourth order, and so on.

In formulas (9) and (11) it is assumed that the determinant of the system in the denominators is not equal to zero. In this case, the system (6) itself (and respectively, (10)) has a unique solution, which is understood as the solution set, or the set of values of all the unknowns. Occasionally, one encounters systems with a determinant equal to zero; their properties are quite different.

For the sake of simplicity, consider system (6). If its determinant is zero,

$$\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = a_1 b_2 - b_1 a_2 = 0$$

then

$$a_1 b_2 = b_1 a_2, \qquad \frac{a_1}{a_2} = \frac{b_1}{b_2}$$

which means the left members of the system are proportional. For instance, a system can have the form

$$\begin{aligned} 2x + 3y &= d_1, \\ 8x + 12y &= d_2 \end{aligned} \tag{14}$$

It is clear that if the right-hand sides are chosen arbitrarily, the system is inconsistent, or contradictory (it does not have a single solution). Only if the right members are in the same proportion as the left members (in this example, $d_2 = 4d_1$) is one of the equations a consequence of the other, so that it can be dropped. But we then get one equation in two unknowns of the type

$$2x + 3y = d_1$$



which  has  an  infinity  of  solutions:  we  can  assign  any  value  to  x 
and  find  the  corresponding  value  of  y. 

This situation turns out to be typical: if the determinant of a system is zero, then a definite relationship exists between the left members of the system. If that relation also holds for the right members, then the system has an infinity of solutions; otherwise there is not a single solution.

An important particular case is a system of $n$ linear homogeneous (without constant terms, that is) equations in $n$ unknowns. For example, for $n = 3$ the system is of the form

$$\begin{aligned} a_1 x + b_1 y + c_1 z &= 0, \\ a_2 x + b_2 y + c_2 z &= 0, \\ a_3 x + b_3 y + c_3 z &= 0 \end{aligned}$$

Such a system of course has the zero (or trivial, hence uninteresting) solution $x = y = z = 0$. It is often important to determine whether there are any other (nontrivial) solutions. The foregoing gives the answer immediately. If the determinant of the system is not equal to zero, then there is only one solution and, hence, there are no nontrivial solutions. But if it is equal to zero, then the system has an infinity of nontrivial solutions, since it cannot be inconsistent in this case. To find these solutions, we discard one of the equations of the system, as was done with respect to system (14).

Now let us apply the results obtained to the solution of a system of homogeneous linear differential equations with constant coefficients.

Consider the system

$$\frac{dy}{dx} = a_1 y + b_1 z, \qquad \frac{dz}{dx} = a_2 y + b_2 z \tag{15}$$

in which all coefficients $a_1$, $b_1$, $a_2$, $b_2$ are constants. The particular solutions are sought in the form

$$y = \lambda e^{px}, \qquad z = \mu e^{px} \tag{16}$$

where $\lambda$, $\mu$, $p$ are as yet unknown constants. Substitution into (15) yields, after cancelling $e^{px}$ and transposing all terms to one side,

$$\begin{aligned} (a_1 - p)\lambda + b_1 \mu &= 0, \\ a_2 \lambda + (b_2 - p)\mu &= 0 \end{aligned} \tag{17}$$

These equations may be regarded as a system of two first-degree algebraic homogeneous equations in two unknowns $\lambda$, $\mu$. For it to have a nontrivial solution (and, by (16), only such a solution interests us), it is necessary and sufficient that the determinant of the system be zero:

$$\begin{vmatrix} a_1 - p & b_1 \\ a_2 & b_2 - p \end{vmatrix} = 0 \tag{18}$$


This is the characteristic equation of the system (15), from which we find possible values for $p$. It can be rewritten by expanding the determinant:

$$p^2 - (a_1 + b_2)p + a_1 b_2 - b_1 a_2 = 0$$

(verify this).

We see that equation (18) is of second degree in $p$ and so has two roots $p_1$, $p_2$. If these roots are distinct, then either one ($p_k$) can be substituted into (17) to find a nontrivial solution $\lambda_k$, $\mu_k$ and, by (16), to obtain the corresponding solution $y(x)$, $z(x)$ of the system (15). Since the quantities $\lambda_k$, $\mu_k$ are determined to within a common arbitrary factor, we multiply by an arbitrary constant $C_1$ the solution corresponding to the root $p_1$, and by $C_2$ the solution corresponding to the root $p_2$, and then add the results. We thus obtain the general solution to the system (15):

$$\begin{aligned} y &= C_1 \lambda_1 e^{p_1 x} + C_2 \lambda_2 e^{p_2 x}, \\ z &= C_1 \mu_1 e^{p_1 x} + C_2 \mu_2 e^{p_2 x} \end{aligned} \tag{19}$$

Here the arbitrary constants $C_1$ and $C_2$ may be found, for example, if we also give the initial conditions for (15):

for $x = x_0$ are given $y = y_0$ and $z = z_0$

If the roots of equation (18) are imaginary, then the solution can either be left in the form (19) or be written in real form, as in Sec. 7.3. If $p_1 = p_2$, then $y$ and $z$ are obtained as a combination of the functions $e^{p_1 x}$ and $xe^{p_1 x}$ (cf. Sec. 7.3).
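The procedure of this section (form the characteristic equation, find its roots, then find $\lambda$, $\mu$ for each root) can be sketched as follows. This is a minimal illustration, not the text's own algorithm: the coefficients in the example are made up, the function names are hypothetical, and the choice $\lambda = b_1$, $\mu = p - a_1$ assumes $b_1 \neq 0$ and distinct roots.

```python
import cmath

def solve_linear_system(a1, b1, a2, b2):
    """Roots p of (18) and a nontrivial (lambda, mu) of (17) for each root.

    Assumes b1 != 0 and distinct roots, so that lambda = b1, mu = p - a1
    satisfies the first equation of (17).
    """
    trace = a1 + b2
    determinant = a1 * b2 - b1 * a2
    disc = cmath.sqrt(trace * trace - 4 * determinant)
    roots = [(trace + disc) / 2, (trace - disc) / 2]
    return [(p, b1, p - a1) for p in roots]

def general_solution(modes, c1, c2, x):
    """Evaluate formula (19) at x for given arbitrary constants c1, c2."""
    y = sum(c * lam * cmath.exp(p * x) for c, (p, lam, mu) in zip((c1, c2), modes))
    z = sum(c * mu * cmath.exp(p * x) for c, (p, lam, mu) in zip((c1, c2), modes))
    return y, z

# illustrative coefficients: dy/dx = y + 2z, dz/dx = 3y + 2z
modes = solve_linear_system(1, 2, 3, 2)
for p, lam, mu in modes:
    print(p, lam, mu)
```

Complex arithmetic is used throughout so that the same sketch also covers imaginary roots; for real roots the imaginary parts are simply zero.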


Exercises 

1.  Investigate  the  solvability  of  the  system  of  algebraic  equations 
x + 2y  = 


3*  + ay 


for distinct a, b.


2. Find the general solution to the system $\frac{dx}{dt} = 4x - y$, $\frac{dy}{dt} = -6x + 3y$.


8.4  Lyapunov  stability  of  the  equilibrium  state 

The  concept  of  stability  as  the  capability  of  an  object,  state  or 
process  to  resist  external  actions  not  indicated  beforehand  appeared 
in  antiquity  and  today  occupies  one  of  the  central  places  in  physics 
and engineering. There are a variety of specific realizations of this
general  notion,  depending  on  the  type  of  object  under  study,  the 
nature  of  the  external  actions,  and  so  forth.  Here  we  consider  one 
of  the  most  important  stability  concepts:  stability  in  the  sense  of 
Lyapunov,  which  was  introduced  for  simple  cases  in  Sec.  7.6. 

Let the state of some object be described by a finite number of parameters; for the sake of simplicity, we take two parameters $x$, $y$, so that changes in this entity in time are specified by two functions $x = x(t)$, $y = y(t)$, where $t$ is time. Let the law of this change have the form of a system of differential equations:

$$\frac{dx}{dt} = P(x, y), \qquad \frac{dy}{dt} = Q(x, y) \tag{20}$$


with specified right members that do not involve the independent variable $t$ explicitly. This last condition signifies that the differential law of development of the process does not change with time. Such processes are called autonomous processes. (Autonomous systems occur, for example, when the equations involve all bodies participating in the problem, since the laws of nature are invariable in time.)

Let the state of equilibrium of the entity at hand (when it does not vary in the course of time) be described by the constant values $x = x_0$, $y = y_0$; then this system of constants, regarded as functions of time, must also satisfy the system (20). From a direct substitution into (20) it follows that for this it is necessary and sufficient that, simultaneously, we have

$$P(x_0, y_0) = 0, \qquad Q(x_0, y_0) = 0 \tag{21}$$


Suppose at a certain time $t_0$ the entity is brought out of the equilibrium state (by some cause) and the parameters $x$, $y$ become $x = x_0 + \Delta x_0$, $y = y_0 + \Delta y_0$. Then, in order to determine subsequent changes in the entity we have to solve the system of equations (20) for the initial conditions

$$x(t_0) = x_0 + \Delta x_0, \qquad y(t_0) = y_0 + \Delta y_0 \tag{22}$$

The state of equilibrium that is being studied is said to be Lyapunov stable (stable in the sense of Lyapunov) if after a slight deviation from this state the entity continues to remain near it all the time. In other words, for the solution of the system (20) under the initial conditions (22) with small $\Delta x_0$, $\Delta y_0$, the differences $\Delta x = x(t) - x_0$, $\Delta y = y(t) - y_0$ must be small for all $t > t_0$.

In order to determine whether stability will occur, substitute

$$x = x_0 + \Delta x, \qquad y = y_0 + \Delta y$$

into (20) to get

$$\begin{aligned} \frac{d(\Delta x)}{dt} &= P(x_0 + \Delta x, y_0 + \Delta y) = (P'_x)_0 \Delta x + (P'_y)_0 \Delta y + \ldots \\ \frac{d(\Delta y)}{dt} &= Q(x_0 + \Delta x, y_0 + \Delta y) = (Q'_x)_0 \Delta x + (Q'_y)_0 \Delta y + \ldots \end{aligned} \tag{23}$$

where $(P'_x)_0 = P'_x(x_0, y_0)$, and so forth. In manipulating the right members we take advantage of Taylor's formula (Sec. 4.6) and formulas (21). The dots stand for terms higher than first order.

Since in determining stability we consider only small $\Delta x$, $\Delta y$, the main part in the right members of (23) is played by the linear terms that are written out (cf. similar reasoning in Sec. 7.6). For this reason, we replace the system (23) by an abridged system (a system of first-order approximation) by discarding higher-order terms:

$$\begin{aligned} \frac{d(\Delta x)}{dt} &= (P'_x)_0 \Delta x + (P'_y)_0 \Delta y, \\ \frac{d(\Delta y)}{dt} &= (Q'_x)_0 \Delta x + (Q'_y)_0 \Delta y \end{aligned} \tag{24}$$


System (24) is a linear system with constant coefficients that is solved by the method of Sec. 8.3. By formula (19) (the notation, however, was different there) the solution of system (24) is obtained as a combination of functions of the form $e^{pt}$, where $p$ satisfies the characteristic equation

$$\begin{vmatrix} (P'_x)_0 - p & (P'_y)_0 \\ (Q'_x)_0 & (Q'_y)_0 - p \end{vmatrix} = 0 \tag{25}$$


Here, to small $\Delta x_0$, $\Delta y_0$ there correspond small values of the arbitrary constants $C_1$, $C_2$ and so the whole matter lies in the behaviour of the function $e^{pt}$ as $t$ increases. Since $p$ can be imaginary as well, $p = r + is$, and then

$$e^{pt} = e^{rt}(\cos st + i \sin st) \tag{26}$$

it follows that the increase or decrease of the perturbation is determined by the sign of $p$ if $p$ is real, and by the sign of $r$ if $p$ is imaginary: if this sign is plus, then the perturbation increases, and if it is minus, it decreases. We arrive at the following conclusions. If all roots of the characteristic equation (25) have a negative real part (in particular, they can be real and negative), then the equilibrium state $(x_0, y_0)$ is stable in the sense of Lyapunov. Besides, for small $\Delta x_0$, $\Delta y_0$ it will then be true that $x(t) \to x_0$, $y(t) \to y_0$ as $t \to \infty$; as we pointed out in Sec. 7.6, this stability is said to be asymptotic. Now if at least one of the roots of equation (25) has a positive real part, then the equilibrium state under consideration is Lyapunov unstable (unstable in the sense of Lyapunov).

We derived these results from the system (24), but by the foregoing the same assertions hold for the complete system (23). Note that if equation (25) has a double root, this does not violate our assertions because even though $t$ appears in our solution as a factor, the exponential $e^{pt}$ tends to zero for $p < 0$ faster than $t$ tends to infinity.

The two conclusions given above do not embrace the case where there are no roots of equation (25) with a positive real part, but there is at least one root with a zero real part. Then in the general solution of system (24) there appear functions of the form

$$e^{ist} = \cos st + i \sin st, \quad |e^{ist}| = 1 \quad \text{or} \quad e^{0t} = 1$$

The entity would appear to be oscillating (or remaining stationary) about the equilibrium state and not striving towards it. But then, due to the unlimitedness of time, the discarded higher-order terms begin to exert an influence and are capable of upsetting the stability. Thus, in this special case, one cannot judge the stability or instability of the equilibrium state on the basis of the roots of equation (25). This can only be done by invoking supplementary considerations, for instance, more terms in the expansion (23). We will not carry out this investigation and will merely note that small perturbations will increase or decay much more slowly, since the time needed for such a perturbation to change by a given factor (say, $e$) is inversely proportional to the perturbation or is of even higher order.
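The three cases distinguished above can be collected into a small routine. This is a sketch under stated assumptions rather than a general tool: the function name `classify_equilibrium`, the tolerance `tol` and the pendulum-like example system are all introduced here for illustration; the inputs are the four partial derivatives that enter equation (25).

```python
import cmath

def classify_equilibrium(px, py, qx, qy, tol=1e-12):
    """Classify an equilibrium from the roots of the characteristic equation (25).

    px, py, qx, qy stand for (P'_x)_0, (P'_y)_0, (Q'_x)_0, (Q'_y)_0.
    """
    trace = px + qy
    determinant = px * qy - py * qx
    disc = cmath.sqrt(trace * trace - 4 * determinant)
    real_parts = [((trace + disc) / 2).real, ((trace - disc) / 2).real]
    if all(r < -tol for r in real_parts):
        return "asymptotically stable"
    if any(r > tol for r in real_parts):
        return "unstable"
    return "not decided by the first-order approximation"

# Hypothetical example: dx/dt = y, dy/dt = -sin x - 0.5 y
# (a pendulum with friction); equilibria at (0, 0) and (pi, 0).
at_bottom = classify_equilibrium(0.0, 1.0, -1.0, -0.5)      # at (0, 0)
at_top = classify_equilibrium(0.0, 1.0, 1.0, -0.5)          # at (pi, 0)
frictionless = classify_equilibrium(0.0, 1.0, -1.0, 0.0)    # no friction, at (0, 0)
print(at_bottom)
print(at_top)
print(frictionless)
```

The third call has purely imaginary roots, so the routine reports that the abridged system (24) alone does not settle the question.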

Exercise 

Find the equilibrium states for the system

$$\frac{dx}{dt} = x - y, \qquad \frac{dy}{dt} = y + y^3$$

and investigate them for stability.

8.5 Constructing approximate formulas for a solution

Methods for constructing approximate formulas for the solution of a differential equation are largely analogous to those methods described in Sec. 1.4 for the solution of “finite” equations. For the sake of simplicity we consider first-order equations, although the same methods can naturally be extended to equations of any order and to systems of equations.
We begin with the method of iteration. Suppose we are considering a first-order differential equation with specified initial condition:

$$\frac{dy}{dx} = f(x, y), \qquad y(x_0) = y_0 \tag{27}$$

Taking the integrals of both sides of the equation, we get

$$y(x) - y(x_0) = \int_{x_0}^{x} f(s, y(s))\,ds$$

That is,

$$y(x) = y_0 + \int_{x_0}^{x} f(s, y(s))\,ds \tag{28}$$

Equation  (28)  is  equivalent  at  once  to  both  the  equations  of  (27), 
since  after  its  differentiation  we  get  the  first  equation,  and  after  the 
substitution  x = x0  the  second  equation.  Equation  (28)  is  an  integral 
equation  because  the  unknown  function  is  under  the  integral  sign. 
Since  it  includes  not  only  the  first  but  also  the  second  equation  of 
(27),  it  has  a unique  solution  and  not  an  infinity  of  solutions,  as  the 
differential  equation  does. 

The form of (28) is convenient for using the iteration method (compare with equation (22) of Ch. 1), although the unknown here is a function, not a number. Choosing a function $y_0(x)$ for the zero-order approximation (it is desirable that it be as close as possible to the desired solution; if we know nothing about this solution, we can at least put $y_0(x) = y_0$), we obtain the first-order approximation via the formula

$$y_1(x) = y_0 + \int_{x_0}^{x} f(s, y_0(s))\,ds$$

Substituting the result into the right side of (28), we get the second-order approximation, and so forth. Generally,

$$y_{n+1}(x) = y_0 + \int_{x_0}^{x} f(s, y_n(s))\,ds \qquad (n = 0, 1, 2, \ldots)$$


* We frequently use a notation like $\int_{x_0}^{x} q(x)\,dx$, but here we must be more careful and distinguish between the upper limit and the variable of integration (cf. HM, Sec. 2.14).
As in Sec. 1.4, if the process of iteration converges, that is, if the sequence of approximations tends to a certain limit function with increasing $n$, then it satisfies the equation (28).

It is a remarkable fact that the iteration method for (28) converges at least for all $x$ sufficiently close to $x_0$. This is due to the fact that when computing subsequent approximations it is necessary to integrate the preceding ones, and in successive integration the functions are as a whole “smoothed”, any inaccuracies due to the choice of the zero-order approximation, roundoff errors, and the like being gradually eliminated.

(In contrast, in the case of successive differentiation, the functions, as a rule, deteriorate, the initial inaccuracies increase, and so an iteration method based on successive differentiation would not yield a convergence. Cf. similar reasoning in Sec. 2.2.)

By way of an illustration, let us consider the problem

$$y' = x^2 + y^2, \qquad y(0) = 1 \tag{29}$$

Integration yields

$$y(x) = 1 + \int_0^x \left[s^2 + y^2(s)\right] ds$$


For the zero-order approximation to the desired solution, of which we know nothing as yet, take the function $y_0(x) = 1$, for it at least satisfies the initial condition. Then, by writing out powers up to $x^4$ inclusive, we get (verify this)

$$y_1(x) = 1 + x + \frac{x^3}{3},$$

$$y_2(x) = 1 + \int_0^x \left[s^2 + \left(1 + s + \frac{s^3}{3}\right)^2\right] ds = 1 + x + x^2 + \frac{2}{3}x^3 + \frac{1}{6}x^4 + \ldots,$$

$$y_3(x) = 1 + x + x^2 + \frac{4}{3}x^3 + \frac{5}{6}x^4 + \ldots,$$

$$y_4(x) = 1 + x + x^2 + \frac{4}{3}x^3 + \frac{7}{6}x^4 + \ldots,$$

$$y_5(x) = 1 + x + x^2 + \frac{4}{3}x^3 + \frac{7}{6}x^4 + \ldots$$

The graphs of the successive approximations are shown in Fig. 95, where the dashed line is the exact solution. It is clear that for small $|x|$ the process converges.
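The successive approximations for problem (29) can be reproduced mechanically. The sketch below is an illustration with assumed helper names (`poly_mul`, `picard_step`): each approximation is stored as a list of exact rational coefficients of $1, x, \ldots, x^4$, and higher powers are dropped, as was done above.

```python
from fractions import Fraction

DEGREE = 4  # keep powers of x up to x^4

def poly_mul(p, q):
    """Product of two polynomials (coefficient lists), truncated at DEGREE."""
    out = [Fraction(0)] * (DEGREE + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= DEGREE:
                out[i + j] += a * b
    return out

def picard_step(y):
    """One iteration of y_{n+1}(x) = 1 + integral from 0 to x of [s^2 + y_n(s)^2] ds."""
    integrand = poly_mul(y, y)
    integrand[2] += 1                      # add the s^2 term
    # integrate term by term: c * s^k  ->  c * x^(k+1) / (k+1)
    result = [Fraction(1)] + [c / (k + 1) for k, c in enumerate(integrand)]
    return result[:DEGREE + 1]             # drop powers above x^DEGREE

y = [Fraction(1)] + [Fraction(0)] * DEGREE   # zero-order approximation y_0 = 1
approximations = [y]
for n in range(5):
    y = picard_step(y)
    approximations.append(y)
    print(n + 1, [str(c) for c in y])
```

Truncating at $x^4$ does not affect the coefficients that are kept, because the coefficient of $x^k$ in each new approximation depends only on coefficients of lower powers in the previous one.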
[Fig. 95]


The question of when to terminate the approximation is ordinarily settled by comparing adjacent approximations.

Another approximate method is based on the fact that one finds the values of $y'(x_0)$, $y''(x_0)$, etc. from the given conditions (27) with the aid of differentiation and then expands the solution in a Taylor series (see HM, Sec. 3.17). The requisite number of terms is determined in the course of the calculations by comparing the results with the desired degree of accuracy.

Let us consider the problem (29). Substitution into the right side of the equation yields $y'(0) = 0^2 + 1^2 = 1$. Differentiating both sides of the equation, we get $y'' = 2x + 2yy'$ and, substituting $x = 0$, we obtain $y''(0) = 2 \cdot 0 + 2 \cdot 1 \cdot 1 = 2$. In similar fashion we get

$$y''' = 2 + 2y'^2 + 2yy'', \qquad y'''(0) = 8,$$

$$y^{IV} = 6y'y'' + 2yy''', \qquad y^{IV}(0) = 28$$

and so on. Substituting this into the Taylor formula, we obtain

$$y = y(0) + \frac{y'(0)}{1!}x + \frac{y''(0)}{2!}x^2 + \ldots = 1 + x + x^2 + \frac{4}{3}x^3 + \frac{7}{6}x^4 + \ldots \tag{30}$$

3 6 
We have the same formula as that obtained via the method of successive approximations. This formula can only be used for small $|x|$; for example, when $x = 1$ the series (30) (like the earlier described iteration method) diverges. It can be demonstrated that this is due to the essence of the problem at hand. Indeed, consider the solution $y_1(x)$ of the equation $\frac{dy}{dx} = y^2$ for the initial condition $y_1(0) = 1$. Since $x^2 + y^2 > y^2$, the direction field in the $xy$-plane that determines the original solution $y(x)$ is located more steeply than the direction field that determines the solution $y_1(x)$. But $y(0) = y_1(0)$, and so for $x > 0$ the curve $y = y(x)$ passes above the curve $y = y_1(x)$. The solution $y_1(x)$ is easy to find by separating the variables, $y_1(x) = \frac{1}{1 - x}$. Thus, $y(x) > \frac{1}{1 - x}$ $(x > 0)$. The right side tends to infinity as $x \to 1 - 0$; hence the solution $y(x)$ also tends to infinity for a certain $x = x_1 \le 1$ as $x$ increases from 0. Computing the value of $x_1$ (which depends on the initial value $y(0)$) via the methods of Sec. 8.7 yields $x_1 = 0.959$. As $x \to x_1 - 0$ we will have $x^2 \ll y^2$ and so $y(x) \approx \frac{1}{x_1 - x}$.

Closely tied to this method is the method of using power series with undetermined coefficients. It consists in seeking the solution of an equation in the form of a series with unknown coefficients:

$$y = a + b(x - x_0) + c(x - x_0)^2 + d(x - x_0)^3 + \ldots$$

which are found by substitution into the equation, subsequent equating of the coefficients of equal powers, and the use of the initial condition if it is given.

Apply the method of undetermined coefficients to the problem (29). Since $x_0 = 0$, we write

$$y = a + bx + cx^2 + dx^3 + ex^4 + \ldots \tag{31}$$

Substituting $x = 0$, we get $a = 1$ by virtue of the initial condition. Before substituting the series (31) into equation (29), it is convenient to expand the right member of this equation in a series in powers of $y - 1$. (In the general case, the right member is expanded in a Taylor series in powers of $x - x_0$, $y - y_0$ by the formula

$$f(x, y) = f_0 + (f'_x)_0 (x - x_0) + (f'_y)_0 (y - y_0) + \ldots$$

where the zero subscript denotes substitution of the values $x = x_0$, $y = y_0$.) We get

$$y' = x^2 + [(y - 1) + 1]^2 = x^2 + 1 + 2(y - 1) + (y - 1)^2$$

Substituting the series (31), we obtain

$$b + 2cx + 3dx^2 + 4ex^3 + \ldots = 1 + x^2 + 2(bx + cx^2 + dx^3 + ex^4 + \ldots) + (bx + cx^2 + dx^3 + \ldots)^2$$
Removing brackets on the right and collecting terms, and then equating the coefficients of equal powers of $x$, we arrive at the relations $b = 1$, $2c = 2b$, $3d = 1 + 2c + b^2$, $4e = 2d + 2bc$, ..., whence in succession we find $b = 1$, $c = 1$, $d = \frac{4}{3}$, $e = \frac{7}{6}$, .... Putting these values into (31), we again arrive at the series (30).
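Equating coefficients power by power amounts to a recurrence, which can be sketched as follows. The formulation is an assumption made for illustration: instead of expanding in powers of $y - 1$, the series $y = \sum a_k x^k$ is squared directly, which leads to the same relations; the function name `series_coefficients` is hypothetical.

```python
from fractions import Fraction

def series_coefficients(n_terms):
    """Coefficients a_0, a_1, ... of y = sum a_k x^k for y' = x^2 + y^2, y(0) = 1."""
    a = [Fraction(1)]                      # a_0 = y(0) = 1
    for n in range(n_terms - 1):
        # coefficient of x^n on the right-hand side:
        # the x^2 term contributes 1 when n == 2, and y^2 contributes sum a_i a_{n-i}
        rhs = sum(a[i] * a[n - i] for i in range(n + 1))
        if n == 2:
            rhs += 1
        # coefficient of x^n in y' is (n + 1) a_{n+1}
        a.append(rhs / (n + 1))
    return a

coeffs = series_coefficients(6)
print([str(c) for c in coeffs])
```

The first five entries correspond to $a$, $b$, $c$, $d$, $e$ of the series (31); asking for more terms only requires a larger argument.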

The perturbation method discussed in Sec. 1.4 is also used in solving differential equations. Here are some examples.

The problem

$$y' = \frac{x}{1 + 0.1xy}, \qquad y(0) = 0 \tag{32}$$

does not contain any parameters. But we can consider the more general problem

$$y' = \frac{x}{1 + \alpha xy}, \qquad y(0) = 0 \tag{33}$$

from which (32) is obtained when $\alpha = 0.1$. The problem (33) is readily solved for $\alpha = 0$: we then get $y = \frac{x^2}{2}$. Therefore we seek a solution by a series expansion in powers of $\alpha$, i.e.

$$y = \frac{x^2}{2} + \alpha u + \alpha^2 v + \alpha^3 w + \ldots \tag{34}$$

where $u = u(x)$, $v = v(x)$ and so on are the as yet unknown functions of $x$.

Substituting  (34)  into  (33)  yields,  after  multiplying  by  the 
denominator, 

(x  + a ur  + aV  + a3*#'  + •■•)  ^1  + x3  + cl2xu  + oc3xv  -f-  ...|  = x , (35) 

au(0)  + oc2v(0)  + ...  = 0 
or 

u(0)  = 0,  *>(0)  = 0,  w(0)  = 0 (36) 

Opening  brackets  in  (35)  and  equating  to  zero  the  coefficients  of 
the  powers  of  a,  we  successively  get 

u'  + — x*  = 0,  v9  + — u'  + x2u  = 0, 

*2  2 


wf  A v9  + xuu’  + x2v  = 0 and  so  on 

2 

whence,  taking  into  account  (36),  we  get  (verify  this!) 

$$u = -\frac{x^5}{10}, \quad v = \frac{7}{160}x^8, \quad w = -\frac{43}{1760}x^{11}$$

and so on. Therefore formula (34) yields

$$y = \frac{x^2}{2} - \frac{\alpha}{10}x^5 + \frac{7\alpha^2}{160}x^8 - \frac{43\alpha^3}{1760}x^{11} + \dots$$

In particular, for equation (32) we obtain

$$y = \frac{x^2}{2} - \frac{x^5}{100} + \frac{7x^8}{16\,000} - \frac{43x^{11}}{1\,760\,000} + \dots$$

This series converges excellently for $|x| \le 1$ and rather decently for
$1 < |x| < 2$. *
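
The invitation to "verify this" can be taken up mechanically. The sketch
below is an illustration only: it writes (33) as $y' + \alpha xyy' = x$,
so that the term of order $\alpha^k$ satisfies
$y_k' = -x\sum_{i+j=k-1} y_i y_j'$ with $y_k(0) = 0$, and it carries out
these integrations exactly on polynomials stored as lists of fractions.
All helper names (`mul`, `integ`, `perturbation_terms`, ...) are ours.

```python
from fractions import Fraction

# A polynomial in x is a list p with p[n] the coefficient of x**n.

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def deriv(p):
    return [n * p[n] for n in range(1, len(p))] or [Fraction(0)]

def integ(p):
    # antiderivative that vanishes at x = 0
    return [Fraction(0)] + [c / (n + 1) for n, c in enumerate(p)]

def perturbation_terms(order):
    """y_0, ..., y_order in y = sum alpha**k y_k(x) for y'(1 + alpha x y) = x, y(0) = 0."""
    ys = [[Fraction(0), Fraction(0), Fraction(1, 2)]]     # y_0 = x**2 / 2
    for k in range(1, order + 1):
        s = [Fraction(0)]
        for i in range(k):
            s = add(s, mul(ys[i], deriv(ys[k - 1 - i])))
        rhs = [Fraction(0)] + [-c for c in s]             # multiply the sum by -x
        ys.append(integ(rhs))                             # enforces y_k(0) = 0
    return ys

terms = perturbation_terms(3)
for k, p in enumerate(terms):
    print(k, [(n, str(c)) for n, c in enumerate(p) if c != 0])
```

The printed pairs (power of $x$, coefficient) for $k = 1, 2, 3$ are the
functions $u$, $v$, $w$ of the expansion (34).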

Now  consider  the  problem 

$$y' = \sin(xy), \quad y(0) = a \tag{37}$$

Unlike the preceding example, the parameter here enters into the
initial condition. For $a = 0$ the problem (37) has the solution $y = 0$.
And so for small $|a|$ we seek the solution in the form

$$y = au + a^2 v + a^3 w + \dots \quad (u = u(x),\ v = v(x),\ \dots) \tag{38}$$

Substitution of the value $x = 0$ produces

$$u(0) = 1, \quad v(0) = 0, \quad w(0) = 0 \tag{39}$$

On the other hand, substituting (38) into the differential equation
(37), we get (taking note of the Taylor series for the sine)

$$au' + a^2 v' + a^3 w' + \dots = \frac{axu + a^2 xv + a^3 xw + \dots}{1!} - \frac{(axu + a^2 xv + a^3 xw + \dots)^3}{3!} + \dots$$

Equating coefficients of like powers of $a$ gives

$$u' = xu, \quad v' = xv, \quad w' = xw - \frac{x^3 u^3}{3!}, \quad \dots$$

Integrating these linear equations with account taken of the
initial conditions (39), we find

$$u = e^{x^2/2}, \quad v = 0, \quad w = \frac{1}{12}(1 - x^2)\,e^{3x^2/2} - \frac{1}{12}\,e^{x^2/2}$$

(verify this!). Substitution of these expressions into (38) produces
an expansion of the desired solution that is suitable for small $|a|$.


Here, the greater $|x|$, the greater the coefficients, and for this reason
the smaller the interval of values of $a$ for which the series is
applicable.

* The substitution $s = -\alpha xy$, $v(s) = -\alpha x^3$, which we leave to
the reader, carries (33) into an equation that does not involve a
parameter. When $s = 1$ we have $|dy/dx| = \infty$. Numerical integration
via the methods of Sec. 8.7 shows that $v(1) = 1.087$, and so the series
(34) cannot converge for $|x| > (1.087/\alpha)^{1/3}$; for $\alpha = 0.1$ the
right member is equal to 2.22.
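
How good the expansion in powers of $a$ is can be judged by comparing it
with a direct numerical integration of (37). The sketch below is only an
illustration: the value $a = 0.2$, the point $x = 1$ and the use of the
classical fourth-order Runge-Kutta rule as a reference are our choices,
not part of the text.

```python
import math

def rk4(f, x0, y0, x_end, n):
    """Classical fourth-order Runge-Kutta integration of y' = f(x, y)."""
    h = (x_end - x0) / n
    x, y = x0, y0
    for i in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x = x0 + (i + 1) * h
    return y

def u(x):
    return math.exp(x * x / 2)

def w(x):
    return ((1 - x * x) * math.exp(1.5 * x * x) - math.exp(x * x / 2)) / 12

a, x_end = 0.2, 1.0
reference = rk4(lambda x, y: math.sin(x * y), 0.0, a, x_end, 200)
err_first = abs(a * u(x_end) - reference)                      # keeps a*u only
err_third = abs(a * u(x_end) + a**3 * w(x_end) - reference)    # adds a**3 * w
print(err_first, err_third)
```

Repeating the experiment with larger $x$ shows the error of the truncated
series growing, in line with the remark on the size of the coefficients.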

In more complicated cases of the use of the perturbation method
it is often useful to find at least the first term of the expansion
involving the parameter, since this gives us an idea of the behaviour of
the solution for a small change in the parameter.

We conclude with the remark that in practice, particularly in
the case of crude estimates, wide use is made of simplifying the
original equation by dropping comparatively small terms, replacing
slowly varying coefficients with constants, and the like. After such a
simplification we might get an equation of one of the integrable types,
which, when integrated, yields a function that can be regarded as
an approximate solution of the original complete equation. At any
rate, it frequently conveys the proper nature of the behaviour of
the exact solution. With this "zero-order approximation" found, it
is sometimes possible, using it, to introduce corrections that take into
account the simplification and thus to find the "first-order
approximation", and so on.

If the equation contains parameters (for example, masses, linear
dimensions of the entities under study, and the like), then one
has to bear in mind that for certain values of these parameters
certain terms of the equation may be relatively small, and for other
values, other terms will be small, so that the simplification is handled
differently for different values of the parameters. Besides, it is
sometimes necessary to split the range of the independent variable into
parts, over each one of which the simplification procedure differs.

Such a simplification of the equation is particularly useful when,
in the very derivation (writing) of the differential equation,
essential simplifying assumptions are made or the accuracy to which the
quantities in question are known is slight. For example, terms of
the equation that are less than the admissible error in other terms
must definitely be dropped.

To illustrate, consider the problem

$$y'' + \frac{1}{1 + 0.1x}\,y + 0.2y^3 = 0, \quad y(0) = 1, \quad y'(0) = 0, \quad 0 \le x \le 2 \tag{40}$$

Since the coefficient of $y$ varies slowly, we replace it by its mean value:

$$\frac{1}{1 + 0.1x} = k; \quad x = 0,\ k = 1; \quad x = 2,\ k = \frac{1}{1.2} = 0.83; \quad \bar{k} = \frac{1 + 0.83}{2} = 0.92$$

Also, we drop the relatively small third term and get the equation

$$y'' + 0.92y = 0$$

with the solution for the given initial conditions

$$y = \cos 0.96x \tag{41}$$

The aspect of this approximate solution confirms the justification
for dropping the last term of the equation: the ratio of the third
term to the second is of the order of $0.2y^2 \le 0.2$ and therefore the
sum of the first two terms is small in comparison with the second,
i.e. the first term and the second should "almost cancel out".

We now introduce a correction for the last term by substituting
into it the approximate solution (41) and leaving the averaged
coefficient:

$$y'' + 0.92y = -0.2\cos^3 0.96x$$

For the given initial conditions, the integration of this equation by
the method of Sec. 7.5 yields

$$y = 0.99\cos 0.96x - 0.08x\sin 0.96x + 0.01\cos 2.88x$$

The difference as compared to the zero-order approximation
(41) is slight, so that the conclusion concerning the significance of
the separate terms in (40) remains valid; at the same time the third
term of (40) made its contribution to the solution. (To take into
account the variable nature of the coefficient $k$ one could replace
the second term in equation (40) by
$[1.1 + 0.1(x - 1)]^{-1}y = \frac{1}{1.1}[1 + 0.091(x - 1)]^{-1}y \approx 0.91y - 0.08(x - 1)\cos 0.96x$.
Yet even this would not lead to a substantial change in the solution.)
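
The quality of the two approximations can be checked against a direct
numerical solution of (40). The following sketch is ours and only
illustrative: it rewrites (40) as a first-order system and integrates it
with the classical Runge-Kutta rule (a reference method chosen for
convenience), then measures the largest deviation of each approximation
on $0 \le x \le 2$.

```python
import math

def accel(x, y):
    # y'' as given by equation (40)
    return -y / (1 + 0.1 * x) - 0.2 * y**3

def reference_table(n, x_end=2.0):
    """RK4 for y' = v, v' = accel(x, y) with y(0) = 1, v(0) = 0."""
    h = x_end / n
    x, y, v = 0.0, 1.0, 0.0
    out = [(x, y)]
    for i in range(n):
        k1y, k1v = v, accel(x, y)
        k2y, k2v = v + h * k1v / 2, accel(x + h / 2, y + h * k1y / 2)
        k3y, k3v = v + h * k2v / 2, accel(x + h / 2, y + h * k2y / 2)
        k4y, k4v = v + h * k3v, accel(x + h, y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        x = (i + 1) * h
        out.append((x, y))
    return out

def zero_order(x):
    return math.cos(0.96 * x)

def first_order(x):
    return (0.99 * math.cos(0.96 * x) - 0.08 * x * math.sin(0.96 * x)
            + 0.01 * math.cos(2.88 * x))

table = reference_table(200)
err_zero = max(abs(zero_order(x) - y) for x, y in table)
err_first = max(abs(first_order(x) - y) for x, y in table)
print(err_zero, err_first)
```

The corrected formula stays noticeably closer to the numerical solution
than (41) does, which is the practical meaning of "first-order
approximation" here.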

Reasoning  in  this  fashion  is  frequently  rather  nonrigorous  (and 
at  times  leads  to  errors)  but  if  it  is  coupled  with  common  sense,  it 
rather  frequently  produces  solutions  that  are  of  practical  use. 

Exercises 

1. Apply the method of successive approximation to the problem
$\frac{dy}{dx} = y$, $y(0) = 1$.

2. By calculating derivatives, find the expansion, in powers
of $x$, of the solution of the problem $\frac{dy}{dx} = e^{xy}$, $y(0) = 0$
up to $x^5$.

3. Find the first two terms of the expansion of the solution of the
problem $\frac{dy}{dx} = y^2 + \alpha x$, $y(0) = 1$ in a series of powers
of $\alpha$.

8.6  Adiabatic  variation  of  a solution 

We now consider another important method of approximate
solution of differential equations, this time using the example of
the linear equation

$$\ddot{x} + \omega^2 x = 0 \quad (x = x(t),\ \omega = \omega(t)) \tag{42}$$

where a dot indicates the derivative with respect to the time $t$, and
the relationship $\omega(t)$ is given. This is the equation of oscillations
of an oscillator whose parameters vary in time; for example, this
may be a pendulum the length of the suspension of which varies,
and the like.

In the general case, equation (42) is not integrable by quadrature
and its investigation is rather complicated. However, in an important
particular case, namely when the coefficient $\omega(t) > 0$ varies
slowly, such an investigation can be carried out. Here the concept
of slow variation is made explicit as follows. To begin with, let
$\omega = \text{constant}$; then in Sec. 7.3 we saw that $\omega$ serves as the
frequency of free oscillations of the oscillator, and so we have a
natural measure of time equal to the period $\frac{2\pi}{\omega}$ of these
oscillations. We agree to say that a quantity $p(t)$ varies slowly (we
also say adiabatically) if its relative change during this period is
small, that is, if

$$|\dot{p}|\,\frac{2\pi}{\omega} \ll |p| \quad \text{or, what is the same thing,} \quad |\dot{p}| \ll \omega\,|p|$$

We will use this definition for the case $\omega = \omega(t)$ as well; the
adiabatic nature of the variation of $\omega$ means that $|\dot{\omega}| \ll \omega^2$.

If $\omega = \text{constant}$, then the general solution of equation (42)
may be written in the form

$$x = C_1\cos\omega t + C_2\sin\omega t = A\sin(\omega t + \varphi_0)$$

where $A$ and $\varphi_0$ are arbitrary constants. In short,

$$x = A\sin\varphi \tag{43}$$

where $\varphi = \omega t + \varphi_0$ and therefore

$$\frac{d\varphi}{dt} = \omega \tag{44}$$

Now if $\omega$ depends on $t$ but varies slowly, then it is natural to
assume that over each small time interval the oscillations of the
oscillator are nearly harmonic with frequency equal to the current value
of $\omega$ and to assume that the solution of equation (42) is still
of the form (43) under the condition (44), where however we now
have $A = A(t)$, $\omega = \omega(t)$. From (44) we get
$\varphi = \int^t \omega(t)\,dt$, the constant lower limit of this integral
being inessential.

Suppose that not only $\omega$ but also $\dot{\omega}$ varies slowly; then it is
natural to assume that $A$ and $\dot{A}$ also vary slowly. From (43) we get

$$\dot{x} = \dot{A}\sin\varphi + A\cos\varphi\cdot\omega$$

$$\ddot{x} = \ddot{A}\sin\varphi + 2\dot{A}\cos\varphi\cdot\omega - A\sin\varphi\cdot\omega^2 + A\cos\varphi\cdot\dot{\omega}$$

and substitution into (42) produces

$$\ddot{A}\sin\varphi + 2\dot{A}\cos\varphi\cdot\omega + A\cos\varphi\cdot\dot{\omega} = 0 \tag{45}$$

Since, by assumption, the first term is of higher order of smallness
than the second, the second and third terms should cancel out. After
cancelling out $\cos\varphi$, we get

$$2\dot{A}\omega + A\dot{\omega} = 0, \quad \text{i.e.} \quad 2\frac{dA}{dt}\,\omega + A\frac{d\omega}{dt} = 0, \quad 2\frac{dA}{A} + \frac{d\omega}{\omega} = 0$$

and after integrating we obtain $2\ln A + \ln\omega = \ln C$, $A^2\omega = C$.
Thus we see that, on our assumptions, the amplitude of oscillations
varies in inverse proportion to the square root of the value of the
natural frequency.

True, since in equation (45) we neglected the higher order term,
the expression $A^2\omega$ is actually not constant. It varies with time,
but the relative rate of its variation is of a higher order compared
with the relative rate of variation of the natural frequency. We say
that the quantity $A^2\omega$ is an adiabatic invariant.
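
The near-constancy of $A^2\omega$ is easy to observe numerically. In the
sketch below everything specific is an assumption made for illustration:
the law $\omega(t) = 1 + 0.005t$, the unit mass, the initial data and the
Runge-Kutta integrator. Since $E = \frac{1}{2}\omega^2 A^2$ for $m = 1$,
the recorded ratio $E/\omega$ equals $\frac{1}{2}A^2\omega$.

```python
EPS = 0.005   # rate of change of the frequency, small compared with omega**2

def omega(t):
    return 1.0 + EPS * t

def acc(t, x):
    return -omega(t) ** 2 * x

def invariant_history(t_end=200.0, n=20000):
    """Integrate x'' + omega(t)**2 x = 0 (m = 1) by RK4 and record E/omega."""
    h = t_end / n
    t, x, v = 0.0, 0.0, 1.0
    history = [0.5 * (v * v + omega(t) ** 2 * x * x) / omega(t)]
    for i in range(n):
        k1x, k1v = v, acc(t, x)
        k2x, k2v = v + h * k1v / 2, acc(t + h / 2, x + h * k1x / 2)
        k3x, k3v = v + h * k2v / 2, acc(t + h / 2, x + h * k2x / 2)
        k4x, k4v = v + h * k3v, acc(t + h, x + h * k3x)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t = (i + 1) * h
        history.append(0.5 * (v * v + omega(t) ** 2 * x * x) / omega(t))
    return history

history = invariant_history()
# the frequency doubles over the run, and so (nearly) does the energy
energy_gain = history[-1] * omega(200.0) / (history[0] * omega(0.0))
# largest relative departure of E/omega from its initial value
spread = max(abs(r / history[0] - 1.0) for r in history)
print(energy_gain, spread)
```

Making `EPS` larger spoils the invariance, which is a convenient way of
seeing where the assumption $|\dot{\omega}| \ll \omega^2$ enters.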

A similar result is also obtained by energy reasoning. The energy
of an oscillator is equal to (see formula (5))

$$E = \frac{m\dot{x}^2}{2} + \frac{m\omega^2 x^2}{2} = \frac{m}{2}(\dot{x}^2 + \omega^2 x^2)$$

whence, differentiating and using equation (42), we get, for
$m = \text{constant}$,

$$\dot{E} = \frac{m}{2}(2\dot{x}\ddot{x} + 2\omega\dot{\omega}x^2 + 2\omega^2 x\dot{x}) = m\omega\dot{\omega}x^2 \tag{46}$$

If $\omega$ and $\dot{\omega}$ vary slowly, then the coefficients of $x^2$ in the
right members are almost constant over the current period, that is,
over a time interval of length $\frac{2\pi}{\omega}$, and we can average over
that interval. Taking into account
$\sin^2(\omega t + \varphi_0) = \frac{1}{2} - \frac{1}{2}\cos 2(\omega t + \varphi_0)$,
we get $\overline{x^2} = \frac{1}{2}A^2$ and, similarly,
$\overline{\dot{x}^2} = \frac{1}{2}\omega^2 A^2$. And so after averaging we have

$$E = \frac{1}{2}m\omega^2 A^2, \quad \dot{E} = \frac{1}{2}m\omega\dot{\omega}A^2 \tag{47}$$

(since $\dot{E}$ varies slowly, we can replace $\overline{\dot{E}}$ again by
$\dot{E}$). From this we have $\frac{\dot{E}}{E} = \frac{\dot{\omega}}{\omega}$,
$\ln E = \ln\omega + \ln C_1$, $E = C_1\omega$, which is to say that
the energy of the oscillator is directly proportional to the
instantaneous value of its natural frequency. Putting this result into
the first equation of (47), we get $C_1\omega = \frac{1}{2}m\omega^2 A^2$,
i.e. $\omega A^2 = \frac{2C_1}{m} = \text{constant}$, as above.

It is interesting to note that the resulting proportionality,
$E \propto \omega$, is readily grasped from the standpoint of quantum
mechanics. The point is that the energy of one quantum is equal to
$\hbar\omega$, where $\hbar \approx 10^{-27}$ g·cm²/s is Planck's constant; the
energy of an oscillator at the $n$th level ($n = 1, 2, 3, \dots$) is equal
to $\hbar\omega n$. If the frequency $\omega$ varies slowly, the oscillator
remains all the time at the same level, that is, $n = \text{constant}$,
whence the proportionality $E \propto \omega$. (See, for instance,
P. Paradoksov, "How Quantum Mechanics Helps Us Understand Classical
Mechanics." [12])

Now let us consider another important special case where $\omega$
and not $\dot{\omega}$ varies slowly; we take the example of equation (42) with
$\omega = \omega_0 + \alpha\sin kt$, where $|\alpha| \ll \omega_0$ and the constant
$k$ is of the same order as $\omega_0$. In this case, when averaging the
right member of (46), it should first be transformed via the formula

$$m\omega_0\alpha k\cos kt\cdot A^2\left[\frac{1}{2} - \frac{1}{2}\cos 2(\omega_0 t + \varphi_0)\right] = \frac{1}{2}m\omega_0\alpha kA^2\cos kt - \frac{1}{4}m\omega_0\alpha kA^2\cos[(2\omega_0 + k)t + 2\varphi_0] - \frac{1}{4}m\omega_0\alpha kA^2\cos[(2\omega_0 - k)t + 2\varphi_0]$$

There can be two cases now. If $k \ne 2\omega_0$, then the mean
value of the right member, as the mean value of the sum of pure
harmonics, is zero, $\overline{\dot{E}} = 0$, with $E$ the adiabatic invariant.
But if $k = 2\omega_0$, then the last term in the right-hand member turns
into a constant, whence, after averaging, we get

$$\dot{E} = -\frac{1}{4}m\omega_0\alpha\cdot 2\omega_0 A^2\cos(2\varphi_0) = -\alpha\cos(2\varphi_0)\,E, \quad \text{or} \quad E = Ce^{-\alpha\cos(2\varphi_0)\,t}$$

To summarize, in the case at hand, the energy of the oscillator,
and so also the amplitude of oscillations, are exponentially increasing
or decreasing in time, depending on the sign of $\cos 2\varphi_0$. This
phenomenon, which is similar to resonance (Sec. 7.5) and occurs due to
the periodic variation of the parameters of the oscillator, is called
parametric resonance (incidentally, parametric resonance is used to
set an ordinary child's swing in motion).
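
The exponential law can be observed in a direct computation. The sketch
below is an illustration under assumptions of our own: $\omega_0 = 1$,
$\alpha = 0.05$, $k = 2\omega_0$, unit mass, a Runge-Kutta integrator, and
the energy measured with the unperturbed frequency $\omega_0$. With the
motion started as $x = \sin(\omega_0 t + \varphi_0)$, the averaged formula
predicts $\ln[E(t)/E(0)] \approx -\alpha\cos(2\varphi_0)\,t$.

```python
import math

OMEGA0 = 1.0
ALPHA = 0.05

def acc(t, x):
    w = OMEGA0 + ALPHA * math.sin(2 * OMEGA0 * t)    # the case k = 2*omega_0
    return -w * w * x

def energy_ratio(phi0, t_end=40.0, n=8000):
    """E(t_end)/E(0) for x'' + omega(t)**2 x = 0 started with phase phi0."""
    h = t_end / n
    t = 0.0
    x, v = math.sin(phi0), OMEGA0 * math.cos(phi0)
    e_start = 0.5 * (v * v + OMEGA0 ** 2 * x * x)
    for i in range(n):
        k1x, k1v = v, acc(t, x)
        k2x, k2v = v + h * k1v / 2, acc(t + h / 2, x + h * k1x / 2)
        k3x, k3v = v + h * k2v / 2, acc(t + h / 2, x + h * k2x / 2)
        k4x, k4v = v + h * k3v, acc(t + h, x + h * k3x)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t = (i + 1) * h
    return 0.5 * (v * v + OMEGA0 ** 2 * x * x) / e_start

growth = energy_ratio(math.pi / 2)    # cos(2*phi0) = -1: energy should grow
decay = energy_ratio(0.0)             # cos(2*phi0) = +1: energy should fall
print(math.log(growth), math.log(decay), ALPHA * 40.0)
```

The agreement is only approximate, since the formula is the result of
averaging and the instantaneous energy oscillates about the mean law.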

8.7  Numerical  solution  of  differential  equations 

It often happens that it is impossible to obtain an exact or
sufficiently satisfactory approximate solution in the form of a formula.
Then a numerical solution is found in which the desired particular
solution (for concrete values of the parameters if they enter into the
statement of the problem) is constructed in tabular form. The
principle of a numerical solution of a differential equation is
exceedingly simple and follows directly from the meaning of a derivative.

Suppose an equation is of the form $\frac{dy}{dx} = f(x, y)$ and the
initial condition $y = y_0$ for $x = x_0$ is given. Then, putting the values
$x_0$, $y_0$ into the function $f(x, y)$, we find the derivative at the
point $x_0$:

$$\left.\frac{dy}{dx}\right|_{x = x_0} = f(x_0, y_0)$$

From this, assuming that $\Delta x$ is a small quantity, we get

$$y(x_0 + \Delta x) = y(x_1) = y_1 = y_0 + \Delta y = y_0 + \left.\frac{dy}{dx}\right|_{x = x_0}\cdot\Delta x = y_0 + f(x_0, y_0)\cdot\Delta x$$

Writing $f(x_0, y_0) = f_0$ for brevity, we give this result as follows:

$$y_1 = y_0 + f_0\cdot\Delta x \tag{48}$$

Now, taking the point $(x_1, y_1)$ for the original one, we can obtain
$y_2 = y(x_2)$, where $x_2 = x_1 + \Delta x$, in exactly the same manner. Thus,
step by step, we can calculate various values of $y$ for distinct values
of $x$. This is Euler's method.

Of course this method provides approximate values of $y$, not
exact values, for the derivative $\frac{dy}{dx}$ does not remain constant
over the interval from $x = x_0$ to $x = x_1$. Therefore, by using formula
(48) we err in determining $y$, and the error is the greater, the larger
$\Delta x$ is.

To be more exact, since the right side of (48) is a sum of
the first two terms of the expansion of $y(x_0 + \Delta x)$ in powers of
$\Delta x$, the error of (48) is of the order of $(\Delta x)^2$, i.e. it does
not exceed $a(\Delta x)^2$, where the coefficient $a$ depends on the type of
function $f(x, y)$.

Suppose it is necessary, if we know the initial condition
$y(x_0) = y_0$, to obtain the value of the solution for $x = x_0 + l$, where
$l$ is great, so that when using (48) and putting $\Delta x = l$ in it, we
allow for an enormous error. To find $y(x_0 + l)$ we partition the interval
between $x = x_0$ and $x = x_0 + l$ into $n$ equal small subintervals:
then the length of each subinterval is $\Delta x = \frac{l}{n}$. To obtain
$y(x_0 + l)$ we will have to take $n$ steps, finding in succession
$y\left(x_0 + \frac{l}{n}\right)$, $y\left(x_0 + 2\frac{l}{n}\right)$, ..., $y(x_0 + l)$.

At each such step the error is of the order of $a\left(\frac{l}{n}\right)^2$,
and for the $n$ steps we have an error of the order of
$a\left(\frac{l}{n}\right)^2 n = \frac{al^2}{n}$. Hence, the error that arises
when using Euler's method is inversely proportional to the number of
steps. If the accuracy $\varepsilon$ is specified, then the necessary number
of steps $n$ is a quantity of the order of $\frac{al^2}{\varepsilon}$. Clearly,
the larger the number of steps, the smaller the error and the more exact
the quantity $y(x_0 + l)$ that we find. But the error decreases very
slowly with increasing number of steps. For this reason, a great
number of steps is needed in order to attain a specified accuracy.
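
The inverse proportionality between the error and the number of steps
can be seen on any equation with a known solution. The sketch below
takes, as an assumption of ours, the test problem $y' = y$, $y(0) = 1$
on an interval of length $l = 1$, whose exact answer is $e$; the function
names are illustrative.

```python
import math

def euler(f, x0, y0, length, n):
    """n steps of formula (48) with step length/n."""
    dx = length / n
    x, y = x0, y0
    for i in range(n):
        y += f(x, y) * dx
        x = x0 + (i + 1) * dx
    return y

def f(x, y):
    return y        # test equation y' = y, exact solution exp(x)

errors = {n: abs(euler(f, 0.0, 1.0, 1.0, n) - math.e) for n in (100, 200, 400)}
for n, err in errors.items():
    print(n, err, n * err)      # n * err stays nearly constant
```

Doubling the number of steps roughly halves the error, so each extra
decimal place costs about ten times more work.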

We say that the approximate value of $y$ given by formula (48)
is the first approximation ($y_I$), so that $y_I = y_0 + f_0\,\Delta x$. To
obtain a more exact second approximation we will take the arithmetic
mean of the derivative at the beginning and at the end of the
interval, computing the derivative at the endpoint of the interval
by means of the first approximation $y_I$. Thus

$$y_{II} = y_0 + \frac{1}{2}\left(\left.\frac{dy}{dx}\right|_{x = x_0,\ y = y_0} + \left.\frac{dy}{dx}\right|_{x = x_0 + \Delta x,\ y = y_I}\right)\Delta x$$

or

$$y_{II} = y_0 + \frac{1}{2}\left[f(x_0, y_0) + f(x_0 + \Delta x, y_I)\right]\Delta x = y_0 + \frac{1}{2}\left[f_0 + f(x_0 + \Delta x,\ y_0 + f_0\,\Delta x)\right]\Delta x$$

It can be shown that $y_{II}$ has an error of the order of $b(\Delta x)^3$,
where $b$ is a constant dependent on the type of $f(x, y)$. Therefore,
the total error, over $n$ steps, in determining $y(x_0 + l)$ will be
$\varepsilon = b\left(\frac{l}{n}\right)^3\cdot n = \frac{bl^3}{n^2}$ and the number
of steps $n$ needed to attain the given accuracy $\varepsilon$ is a quantity
of the order of $\sqrt{\frac{bl^3}{\varepsilon}}$. In this case the error is
inversely proportional to the square of the number of steps, that is,
as the number of steps increases it falls off much faster than when the
first approximation $y_I$ is used.

Note, however, that in finding $y_{II}$ we have to compute $f(x, y)$
twice at each step, whereas in finding $y_I$ we have to calculate only
one value of $f(x, y)$ at each step. Indeed, using the Euler method,
we begin the calculation with the point $(x_0, y_0)$ and find
$y_I(x_1) = y_0 + f_0\cdot\Delta x$ and then compute $f(x_1, y_I(x_1))$ and
pass to the next step.

If we want to find a second approximation, the scheme is as
follows. First find $y_I(x_1) = y_0 + f_0\cdot\Delta x$ and then determine
$f(x_1, y_I(x_1))$ and

$$\bar{f}_0 = \frac{1}{2}\left[f(x_0, y_0) + f(x_1, y_I(x_1))\right]$$

We then find $y_{II}(x_1) = y_0 + \bar{f}_0\cdot\Delta x$ and, finally,
$f(x_1, y_{II}(x_1))$. Only then is everything ready for the next step.
This calculation scheme is called the recalculation method because the
quantity $f(x, y)$ is recalculated at each step and is replaced by the
more reliable quantity $\bar{f}(x, y)$.

Computation of the values of $f(x, y)$ is, as a rule, the most
cumbersome operation (the other operations, multiplying by $\Delta x$ and
addition, are done much faster), so that the amount of work done
on $n$ steps in this scheme involving recalculation is equivalent to the
work needed for $2n$ steps in the scheme for calculating $y_I$. Despite
this, however, if high accuracy is needed, that is, if $\varepsilon$ is very
small, then the recalculation scheme is better since
$2\sqrt{\frac{bl^3}{\varepsilon}} \ll \frac{al^2}{\varepsilon}$ if $\varepsilon$ is small.

This scheme has yet another advantage. In it there is a good
check of the computations and of the choice of the quantity $\Delta x$,
which is called a step: it is clear that a calculation is good only
insofar as the values of $f(x, y)$ and $\bar{f}(x, y)$ differ but slightly.

Let us consider an example. Suppose $y$ is a solution of the
equation $y' = x^2 - y^2$ with the initial condition $y = 0$ when $x = -1$.
Let us determine the solution for $x = 0$. We take advantage of the
recalculation scheme by taking the step $\Delta x = 0.1$.

The calculations are given in Table 5. The intermediate results
are given in the second and third columns of the table in parentheses,
and under them the results of the recalculation. In the last column
are the values of $y$ correct to four decimal places. Comparing them
with the ones we obtained, we see that all the values obtained are
correct to three decimal places.

Table 5

    x         y          y' = f      $\bar{f}$     y exact

  -1.0      0.0000       1.0000                    0.0000
           (0.1000)     (0.8000)      0.9000
  -0.9      0.0900       0.8019                    0.0900
           (0.1702)     (0.6110)      0.7064
  -0.8      0.1606       0.6142                    0.1607
           (0.2220)     (0.4407)      0.5274
  -0.7      0.2133       0.4444                    0.2135
           (0.2577)     (0.2936)      0.3690
  -0.6      0.2502       0.2974                    0.2504
           (0.2799)     (0.1717)      0.2345
  -0.5      0.2736       0.1752                    0.2738
           (0.2911)     (0.0753)      0.1252
  -0.4      0.2861       0.0782                    0.2862
           (0.2939)     (0.0036)      0.0409
  -0.3      0.2902       0.0058                    0.2902
           (0.2908)    (-0.0446)     -0.0194
  -0.2      0.2883      -0.0431                    0.2882
           (0.2840)    (-0.0706)     -0.0568
  -0.1      0.2826      -0.0699                    0.2823
           (0.2756)    (-0.0760)     -0.0730
   0.0      0.2753      -0.0758                    0.2749
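
The computation of Table 5 takes only a few lines of code. The sketch
below is illustrative (the function names are ours); it carries out the
recalculation scheme for $y' = x^2 - y^2$, $y(-1) = 0$ with step 0.1 and,
for comparison, the plain Euler method with the same step. Unlike the
hand computation, it does not round the intermediate results to four
places, so the last digit may differ from the table.

```python
def f(x, y):
    return x * x - y * y

def euler(f, x0, y0, h, n):
    x, y = x0, y0
    for i in range(n):
        y += h * f(x, y)
        x = x0 + (i + 1) * h
    return y

def recalculation(f, x0, y0, h, n):
    """Second approximation: average the slopes at both ends of each step."""
    x, y = x0, y0
    rows = [(x, y)]
    for i in range(n):
        f0 = f(x, y)
        y_first = y + h * f0                      # first approximation y_I
        f_mean = 0.5 * (f0 + f(x + h, y_first))   # mean slope over the step
        y = y + h * f_mean                        # second approximation y_II
        x = x0 + (i + 1) * h
        rows.append((x, y))
    return rows

rows = recalculation(f, -1.0, 0.0, 0.1, 10)
for x, y in rows:
    print(f"{x:5.1f}  {y:.4f}")
y_recalc = rows[-1][1]
y_euler = euler(f, -1.0, 0.0, 0.1, 10)
print(y_recalc, y_euler)
```

With the same ten steps the Euler value at $x = 0$ is off in the second
decimal place, whereas the recalculated value agrees with the exact one
to three places.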

The method of recalculation is amenable to further refinement,
the result of which are the presently widely used computational
methods of Runge-Kutta and Milne that may be found in texts dealing
with numerical methods. The method of Adams, which is based on
finite differences (Sec. 2.1), is also widely used. We now give this
method in a simplified version (ordinarily it is carried to the third
differences, but we will confine ourselves to second differences). We
proceed from Newton's formula (Ch. 2, (5)) applied to the derivative
of the desired solution, $y'(x)$, and in place of $k$ we take $k - 1$:

$$y'(x) = y'_k - \delta y'_{k-1/2}\,\frac{h - s}{h} + \frac{\delta^2 y'_{k-1}}{2}\,\frac{h - s}{h}\left(\frac{h - s}{h} - 1\right), \quad s = x - x_{k-1}$$

Integrating this equation from $x = x_k$ to $x = x_{k+1}$, that is, from
$s = h$ to $s = 2h$, we get (verify this)

$$\int_{x_k}^{x_{k+1}} y'(x)\,dx = y_{k+1} - y_k = y'_k h + \delta y'_{k-1/2}\,\frac{h}{2} + \delta^2 y'_{k-1}\,\frac{5}{12}h$$

or

$$y_{k+1} = y_k + \left(y'_k + \frac{1}{2}\,\delta y'_{k-1/2} + \frac{5}{12}\,\delta^2 y'_{k-1}\right)h \tag{49}$$

This formula is used in the following manner. First, via some
procedure (say, with the aid of Taylor's formula, Sec. 8.5, or by means
of the recalculation method), we find the values $y_1 = y(x_0 + h)$
and $y_2 = y(x_0 + 2h)$. Then we calculate the corresponding values

$$y'_0 = f(x_0, y_0), \quad y'_1 = f(x_1, y_1) = f(x_0 + h, y_1), \quad y'_2 = f(x_2, y_2)$$

with the aid of which we get

$$\delta y'_{1/2} = y'_1 - y'_0, \quad \delta y'_{3/2} = y'_2 - y'_1, \quad \delta^2 y'_1 = \delta y'_{3/2} - \delta y'_{1/2}$$

Furthermore, assuming in (49) $k = 2$, we compute $y_3$ and, with its
aid,

$$y'_3 = f(x_3, y_3), \quad \delta y'_{5/2} = y'_3 - y'_2, \quad \delta^2 y'_2 = \delta y'_{5/2} - \delta y'_{3/2}$$

Then, putting $k = 3$ in (49), we compute $y_4$, and with its aid,
$y'_4 = f(x_4, y_4)$, and so on.
Particular care is needed in the numerical solution of differential
equations when the desired function can increase indefinitely. For
instance, suppose we have the equation $\frac{dy}{dx} = y^2$ and the initial
condition $y = 1$ for $x = 1$.

It is easy to find the exact solution of this equation. Indeed,
$\frac{dy}{y^2} = dx$, whence $\int_1^y \frac{dy}{y^2} = \int_1^x dx$, or
$1 - \frac{1}{y} = x - 1$, which yields $y = \frac{1}{2 - x}$. It is clear
that $y$ increases indefinitely as $x$ approaches 2. If we were to solve
this equation numerically, we would of course get a very definite though
perhaps very large value of $y$ for $x = 2$.

We found the value of the argument $x = 2$, during the approach
to which the solution increases without bound, solely due to the
fact that we could write the desired solution in explicit form. But
many equations lack such a solution. How is it possible, when
solving an equation numerically, to find that value of $x$ during the
approach to which the solution increases indefinitely?

What do we do, for example, with the Riccati equation

$$\frac{dy}{dx} = \varphi(x)\cdot y^2 + \psi(x) \tag{50}$$

which for arbitrary $\varphi(x)$ and $\psi(x)$ cannot be solved exactly?*

We introduce a new sought-for function $z = \frac{1}{y}$. Then $z = 0$
for the value of $x$ we are interested in. Note that
$\frac{dz}{dx} = -\frac{1}{y^2}\frac{dy}{dx}$. Divide (50) by $-y^2$ to get
$-\frac{1}{y^2}\frac{dy}{dx} = -\varphi(x) - \frac{\psi(x)}{y^2}$, which is

$$\frac{dz}{dx} = -\varphi(x) - z^2\psi(x)$$

Solving the last equation numerically, we get the value of $x$
at which $z = 0$.
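
A possible implementation of this device is sketched below. The function
name `blow_up_point`, the Runge-Kutta integrator, the step and the linear
interpolation used to locate the zero of $z$ are all our own choices; the
sketch also assumes $y_0 > 0$, so that $z$ starts positive. It is tried
first on $y' = y^2$, $y(1) = 1$, where the answer $x = 2$ is known, and
then on $y' = y^2 + x^2$, $y(0) = 1$.

```python
def blow_up_point(phi, psi, x0, y0, h=1e-3, max_steps=100000):
    """Find where the solution of y' = phi(x) y**2 + psi(x) becomes infinite.

    Integrates z = 1/y, which obeys z' = -phi(x) - psi(x) z**2, and returns
    the x at which z first reaches zero (None if it never does)."""
    def g(x, z):
        return -phi(x) - psi(x) * z * z
    x, z = x0, 1.0 / y0
    for i in range(max_steps):
        k1 = g(x, z)
        k2 = g(x + h / 2, z + h * k1 / 2)
        k3 = g(x + h / 2, z + h * k2 / 2)
        k4 = g(x + h, z + h * k3)
        z_new = z + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        if z_new <= 0.0:
            # z changed sign inside this step: interpolate linearly
            return x + h * z / (z - z_new)
        x, z = x0 + (i + 1) * h, z_new
    return None

x_simple = blow_up_point(lambda x: 1.0, lambda x: 0.0, 1.0, 1.0)
x_riccati = blow_up_point(lambda x: 1.0, lambda x: x * x, 0.0, 1.0)
print(x_simple, x_riccati)
```

Because $z$ passes through zero with a finite slope, nothing special
happens to the computation near the point where $y$ itself is infinite.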

The foregoing methods of numerical solution of differential
equations are readily carried over to the case of a system of two or
more equations. Let us illustrate this in the case of a system of two
equations:

$$\frac{dy}{dx} = f(x, y, z), \quad \frac{dz}{dx} = \varphi(x, y, z)$$

* The equation $\frac{dy}{dx} = y^2$ just considered is obtained from (50)
for $\varphi(x) \equiv 1$, $\psi(x) \equiv 0$.

Suppose we have the initial conditions $y = y_0$, $z = z_0$ for $x = x_0$.
Set $f(x_0, y_0, z_0) = f_0$, $\varphi(x_0, y_0, z_0) = \varphi_0$. Then the
first approximation is

$$y_I(x_1) = y_I(x_0 + \Delta x) = y_0 + f_0\cdot\Delta x,$$

$$z_I(x_1) = z_I(x_0 + \Delta x) = z_0 + \varphi_0\cdot\Delta x$$

To obtain a second approximation, we use the first approximation
to find the values of the derivatives at the endpoint of the interval,
i.e., for $x_1 = x_0 + \Delta x$. We get

$$\left.\frac{dy}{dx}\right|_{x = x_0 + \Delta x} = f_I = f(x_0 + \Delta x, y_I, z_I),$$

$$\left.\frac{dz}{dx}\right|_{x = x_0 + \Delta x} = \varphi_I = \varphi(x_0 + \Delta x, y_I, z_I)$$

We then determine the mean values:

$$\bar{f} = \frac{1}{2}(f_0 + f_I), \quad \bar{\varphi} = \frac{1}{2}(\varphi_0 + \varphi_I)$$

The second approximation is

$$y_{II} = y_0 + \bar{f}\cdot\Delta x, \quad z_{II} = z_0 + \bar{\varphi}\cdot\Delta x$$

Finally, having obtained $y_{II}$ and $z_{II}$, we recalculate the values of
the derivatives at $x = x_0 + \Delta x$:

$$\left.\frac{dy}{dx}\right|_{x = x_0 + \Delta x} = f_{II} = f(x_0 + \Delta x, y_{II}, z_{II}),$$

$$\left.\frac{dz}{dx}\right|_{x = x_0 + \Delta x} = \varphi_{II} = \varphi(x_0 + \Delta x, y_{II}, z_{II})$$

An  equation  of  the  second  or  higher  order  can,  as  was  shown  in 
Sec.  8.2,  be  reduced  to  a system  of  equations  of  the  first  order. 
Therefore  the  methods  of  numerical  solution  are  applicable  also  to 
equations  of  higher  than  first  order. 
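
As an illustration, the recalculation scheme for a system of two
equations may be sketched as follows. The function name is ours, and the
test case, the equation $y'' + y = 0$ written as $y' = z$, $z' = -y$ with
$y(0) = 0$, $z(0) = 1$, is chosen because its solution $y = \sin x$,
$z = \cos x$ is known.

```python
import math

def recalculation_system(f, phi, x0, y0, z0, dx, n):
    """Second approximation for the system y' = f(x, y, z), z' = phi(x, y, z)."""
    x, y, z = x0, y0, z0
    for i in range(n):
        f0, p0 = f(x, y, z), phi(x, y, z)
        y1, z1 = y + f0 * dx, z + p0 * dx                # first approximation
        f1, p1 = f(x + dx, y1, z1), phi(x + dx, y1, z1)  # slopes at the endpoint
        y += 0.5 * (f0 + f1) * dx                        # second approximation
        z += 0.5 * (p0 + p1) * dx
        x = x0 + (i + 1) * dx
    return y, z

y_end, z_end = recalculation_system(
    lambda x, y, z: z,       # dy/dx = z
    lambda x, y, z: -y,      # dz/dx = -y
    0.0, 0.0, 1.0, 0.1, 10)
print(y_end, math.sin(1.0), z_end, math.cos(1.0))
```

Both unknown functions must be advanced together, since the slopes at
the endpoint require the first approximations of $y$ and $z$ alike.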

Note another special case. Suppose we have the equation

$$\frac{d^2 x}{dt^2} = \varphi(x, t) \tag{51}$$

Unlike the general case of a second-order equation, the right side
here does not contain $\frac{dx}{dt}$.*

* Such an equation describes, for example, the motion of a body under the
action of a force that depends on the position of the body and on the
time in the absence of friction.

Equation (51) admits an analytical solution in two cases.

1. If the right member $\varphi(x, t)$ does not depend on $x$. Then (51)
assumes the form $\frac{d^2 x}{dt^2} = \varphi(t)$, whence
$\frac{dx}{dt} = \int\varphi(t)\,dt$. Integrating once again, we get $x(t)$.

2. If the right side does not depend on $t$, that is, (51) has the
form $\frac{d^2 x}{dt^2} = \varphi(x)$. In this case we set $v = \frac{dx}{dt}$
and then

$$\frac{d^2 x}{dt^2} = \frac{dv}{dt} = \frac{dv}{dx}\,\frac{dx}{dt} = v\frac{dv}{dx} = \frac{1}{2}\,\frac{d(v^2)}{dx}$$

We can rewrite the equation thus: $\frac{1}{2}\frac{d(v^2)}{dx} = \varphi(x)$,
whence

$$v^2 = 2\int\varphi(x)\,dx, \quad v = \frac{dx}{dt} = \pm\sqrt{2\int\varphi(x)\,dx}$$

which is an equation with variables separable.

But when the right member depends both on $x$ and on $t$, neither
of these procedures can be used, so the equation has to be solved
numerically. First put $\frac{dx}{dt} = v$ and then replace (51) by the
first-order system of equations

$$\frac{dx}{dt} = v, \quad \frac{dv}{dt} = \varphi(x, t) \tag{52}$$

The system (52) has the following peculiarity: the right side of
the first equation of the system does not involve $x$ and the right
side of the second equation does not contain $v$. This permits
suggesting a very convenient computation scheme.

Suppose we have the initial conditions $x = x_0$, $\frac{dx}{dt} = v = v_0$
for $t = t_0$. We choose a step $\Delta t$, that is, we seek the values of the
solutions for the following values of the arguments: $t_1 = t_0 + \Delta t$,
$t_2 = t_0 + 2\Delta t$, $t_3 = t_0 + 3\Delta t$, and so forth.

We call the values of the arguments $t_0, t_1, t_2, t_3, \dots$ integral and
the values $t_{1/2}, t_{3/2}, t_{5/2}, \dots$ half-integral. We compute the
values of $v = \frac{dx}{dt}$ for half-integral values of the arguments, and
the values of $x$ for integral values of the arguments. The sequence of
operations is now as follows. Knowing $x_0$ and $v_0$, we find $v_{1/2}$ from
the formula $v_{1/2} = v_0 + \varphi_0\cdot\frac{\Delta t}{2}$, where
$\varphi_0 = \varphi(x_0, t_0)$. Then we determine
$x_1 = x_0 + v_{1/2}\cdot\Delta t$, $\varphi_1 = \varphi(x_1, t_1)$ and
$v_{3/2} = v_{1/2} + \varphi_1\cdot\Delta t$. The process is then repeated:
$x_2 = x_1 + v_{3/2}\cdot\Delta t$, $\varphi_2 = \varphi(x_2, t_2)$,
$v_{5/2} = v_{3/2} + \varphi_2\cdot\Delta t$, and so on.

It is convenient to arrange the computations in a table.

Table 6

    t          x        $\varphi(x, t)$                      v = dx/dt

  $t_0$      $x_0$     $\varphi_0$                           $v_0$
  $t_{1/2}$                                                  $v_{1/2}$
  $t_1$      $x_1$     $\varphi_1 = \varphi(x_1, t_1)$
  $t_{3/2}$                                                  $v_{3/2}$
  $t_2$      $x_2$     $\varphi_2 = \varphi(x_2, t_2)$
  $t_{5/2}$                                                  $v_{5/2}$
  $t_3$      $x_3$     $\varphi_3$
  $t_{7/2}$                                                  $v_{7/2}$
  $t_4$      $x_4$     $\varphi_4$

Thus, when passing, say, from $x_1$ to $x_2$, we use the value of the
derivative $v_{3/2}$, which value corresponds to the midpoint of the
interval. As a result, the accuracy of this procedure is of the same
order as that of a second approximation in the ordinary method.
(The error at each step is of the order of $(\Delta t)^3$.) The effort
expended is the same as when using a first approximation, but we have
higher accuracy due to a more reasonable construction of the
computational scheme.

It is worth noting once again that this scheme is only possible
because of the peculiarities of the system (52).
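
The scheme with integral and half-integral arguments may be sketched as
follows. The function name `half_step_scheme` is ours, and the test
case $\varphi(x, t) = -x$ with $x(0) = 0$, $v(0) = 1$ (exact solution
$x = \sin t$) is an assumption made only so that the result can be
checked.

```python
import math

def half_step_scheme(phi, t0, x0, v0, dt, n):
    """x at integral times and v at half-integral times for x'' = phi(x, t)."""
    x = x0
    v_half = v0 + phi(x0, t0) * dt / 2               # v at t_{1/2}
    xs = [x0]
    for i in range(1, n + 1):
        x = x + v_half * dt                          # x_i from the midpoint velocity
        xs.append(x)
        v_half = v_half + phi(x, t0 + i * dt) * dt   # v at t_{i+1/2}
    return xs

xs = half_step_scheme(lambda x, t: -x, 0.0, 0.0, 1.0, 0.1, 10)
error = abs(xs[10] - math.sin(1.0))
print(xs[10], math.sin(1.0), error)
```

The right member $\varphi$ is computed once per step, as in the first
approximation, while the error at $t = 1$ is of the size obtained with
the recalculation scheme.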

Exercises 

1. Set up a table of the values of the function $y = e^x$ in a numerical solution of the equation $y' = y$ with initial condition $y = 1$ for $x = 0$.

Obtain the values of $e^x$ for $x = 0.1, 0.2, 0.3$, and so on at intervals of 0.1 up to $x = 1$. Carry the computations via the first approximation to four decimal places. Then take the step $\Delta x = 0.05$ and use the scheme involving recalculation. Compare the results with the tabulated figures.

2. Let $y(x)$ be a solution of the equation $\frac{dy}{dx} = x^2 + y^2$ with initial condition $y = 0.5$ for $x = 0$. Find $y(0.5)$.

Work the problem in two ways:

(a) using the first approximation with step $\Delta x = 0.025$;

(b) using the second approximation with step $\Delta x = 0.05$.

Compare the results.*

* The differential equations of Exercises 2 and 5 cannot be solved analytically.
3. Let a body of mass $m$ be in motion under the action of a force $f(t) = at(\theta - t)$ and let it experience a resistance of the medium proportional to the velocity (with proportionality factor $k$). Let the velocity at the initial time be 0. Set up a differential equation and solve it numerically for the case $m = 10$ g, $a = 5$ g·cm/s⁴, $\theta = 20$ s, $k = 2$ g/s. Determine the velocity of the body 1.5 s after the start of motion.

Hint. Take advantage of the recalculation scheme, taking $\Delta t = 0.05$. Carry the calculations to three places.

4. Set up a table of the values of the function $y = \sin x$, solving the equation $y'' + y = 0$ numerically with initial conditions $y = 0$, $\frac{dy}{dx} = 1$ at $x = 0$. Obtain the values for $x = 0.1, 0.2, 0.3$, etc. at 0.1 intervals up to $x = 1$. Use the recalculation scheme taking the step $\Delta x = 0.1$. Compare the results with those tabulated. Carry the computations to three decimal places.

5. Set up a table of values of the solution of the equation $\frac{d^2x}{dt^2} = t + x^2$ with initial conditions $x = 0$, $\frac{dx}{dt} = 1$ for $t = 0$. Form the table from $t = 0$ to $t = 0.5$. Take step length $\Delta t = 0.1$ and compute with a scheme involving half-integral values of the argument to three decimal places.

8.8  Boundary-value  problems 

The general solution of an $n$th-order differential equation has $n$
arbitrary  constants,  that  is,  it  possesses  n degrees  of  freedom  (see 
Sec.  4.8).  In  order  to  pick  a particular  solution  from  the  general 
solution  we  have,  up  to  now,  made  use  of  the  initial  conditions, 
according  to  which  the  desired  function  and  its  derivatives  are  spe- 
cified for  a single  value  of  the  argument.  This  is  quite  natural  if  the 
independent  variable  is  time,  that  is,  if  we  are  studying  the  deve- 
lopment of  some  process : the  initial  conditions  here  simply  serve  as 
a mathematical  notation  for  the  initial  state  of  the  process.  That  is 
where  the  terms  initial  conditions , initial-value  problem  come  from 
even  when  the  independent  variable  has  a quite  different  physical 
meaning.  But  there  are  also  problems  that  are  stated  differently: 
for  example,  problems  in  which  there  are  two  “key”  values  of  the 
independent variable having equal status for which the desired function is given. For example, when considering the deviation $y(x)$ of a string fixed at the endpoints $x = a$ and $x = b$, the conditions imposed on the desired function $y(x)$ are: $y(a) = 0$, $y(b) = 0$. There
are  also  other  methods  for  finding  a particular  solution  from  the 
general  solution  that  are  encountered  in  practical  problems.  Com- 
mon to  all  these  methods  is  that  the  number  of  supplementary  equa- 
tions imposed  on  the  desired  solution  must  be  equal  to  the  number 
of degrees of freedom in the general solution of the equation at hand, that is to say, to the order of this equation.

We  will  consider  a solution  of  the  equation 

$$y'' + p(x)\,y' + q(x)\,y = f(x) \quad (a \le x \le b) \tag{53}$$

with the accessory conditions

$$y(a) = \alpha_1, \quad y(b) = \alpha_2 \tag{54}$$

although  all  the  general  conclusions  we  obtained  hold  true  for  linear 
differential  equations  of  any  order  n under  linear  accessory  condi- 
tions of  any  kind.  The  conditions  of  type  (54)  that  are  imposed  at 
the  endpoints  of  the  interval  on  which  the  solution  is  constructed 
are  called  boundary  conditions , and  a problem  involving  the  solution 
of  a differential  equation  with  specified  boundary  conditions  is  said 
to  be  a boundary-value  problem. 

In  Ch.  7 (Secs.  7.2,  7.3,  7.5)  we  saw  that  the  general  solution  of 
the  nonhomogeneous  linear  equation  (53)  has  the  following  structure: 

$$y = Y(x) + C_1 y_1(x) + C_2 y_2(x) \tag{55}$$

Here, $Y(x)$ is a particular solution of the equation (53), $y_1$ and $y_2$ are two independent solutions of the corresponding homogeneous equation, and $C_1$ and $C_2$ are arbitrary constants. Substituting (55) into the conditions (54), we get two relations for finding $C_1$ and $C_2$:

$$\begin{aligned} C_1 y_1(a) + C_2 y_2(a) &= \alpha_1 - Y(a) \\ C_1 y_1(b) + C_2 y_2(b) &= \alpha_2 - Y(b) \end{aligned} \tag{56}$$

Two  cases  (see  Sec.  8.3)  can  arise  in  the  solution  of  this  system 
of  two  algebraic  equations  of  the  first  degree  in  two  unknowns. 

1. Basic case: the determinant of the system is different from zero. Here, the system (56) has a very definite solution, and therefore the equation (53) with conditions (54) has one and only one solution for any nonhomogeneous term $f(x)$ and for any numbers $\alpha_1$, $\alpha_2$.

2. Particular case: the determinant of the system is zero. Here, the system (56) is, as a rule, inconsistent, but for certain right-hand members it has an infinitude of solutions. This means that equation (53) with conditions (54), for an arbitrary choice of the function $f(x)$ and the numbers $\alpha_1$, $\alpha_2$, as a rule has no solution at all. However, for certain such choices the problem has an infinity of solutions. For example, it can be verified that if $f(x)$ and $\alpha_1$ have already been chosen, then an infinitude of solutions results only for a single value of $\alpha_2$ and there is no solution at all for the remaining values.

It must be stressed that which of the two cases occurs depends only on the form of the left-hand members of equation (53) and conditions (54).
By Sec. 8.3, for the basic case to occur it is necessary and sufficient that the corresponding homogeneous problem (in which $f(x) \equiv 0$, $\alpha_1 = \alpha_2 = 0$) have only a trivial solution. In the particular case, the
homogeneous  problem  has  an  infinity  of  solutions,  and  if  the  non- 
homogeneous  problem  has  at  least  one  solution,  then  the  general 
solution  is  obtained  if  to  this  particular  solution  we  add  the  general 
solution  of  the  corresponding  homogeneous  problem. 

In solving an initial-value problem (that is, one with initial conditions), we always deal with the basic case, since such a solution always exists and is unique. In solving a boundary-value problem we may also encounter the particular case. For example, consider the problem

$$y'' + y = 0 \quad \left(0 \le x \le \frac{\pi}{2}\right), \quad y(0) = \alpha_1, \quad y\left(\frac{\pi}{2}\right) = \alpha_2$$

By virtue of Sec. 7.3, the general solution of the equation is of the form

$$y = C_1 \cos x + C_2 \sin x \tag{57}$$

Substituting the boundary conditions, we get

$$C_1 = \alpha_1, \quad C_2 = \alpha_2$$

Hence, for arbitrary $\alpha_1$, $\alpha_2$, we get a very definite solution

$$y = \alpha_1 \cos x + \alpha_2 \sin x$$

This is the basic case.

For the problem

$$y'' + y = 0 \quad (0 \le x \le \pi), \quad y(0) = \alpha_1, \quad y(\pi) = \alpha_2 \tag{58}$$

substitution of the boundary conditions into the same general solution (57) yields

$$C_1 = \alpha_1, \quad -C_1 = \alpha_2, \quad \text{that is,} \quad C_1 = -\alpha_2$$

Thus, if $\alpha_1 \ne -\alpha_2$, then the problem (58) has no solution at all. But if $\alpha_1 = -\alpha_2$, then the problem has the solution

$$y = \alpha_1 \cos x + C_2 \sin x$$

in which $C_2$ is quite arbitrary, which is to say that we have an infinitude of solutions. This is the particular case.
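
The distinction between the two cases can be checked numerically by evaluating the determinant of system (56) for the two intervals just considered. The following is a minimal sketch; the helper names `boundary_determinant` and `classify` and the tolerance are our own choices, not part of the text.

```python
import math


def boundary_determinant(y1, y2, a, b):
    """Determinant of system (56): | y1(a) y2(a) ; y1(b) y2(b) |."""
    return y1(a) * y2(b) - y2(a) * y1(b)


def classify(det, tol=1e-12):
    """Basic case if the determinant is nonzero (up to rounding), else particular."""
    return "basic case" if abs(det) > tol else "particular case"


# y'' + y = 0 has the independent solutions cos x and sin x.
det_half = boundary_determinant(math.cos, math.sin, 0.0, math.pi / 2)
det_full = boundary_determinant(math.cos, math.sin, 0.0, math.pi)

print("interval [0, pi/2]:", classify(det_half))
print("interval [0, pi]:  ", classify(det_full))
```

In floating-point arithmetic the second determinant is not exactly zero but of the order of rounding error, which is why a small tolerance is used.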

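Finally, consider a problem with the parameter $\lambda = \text{constant}$:

$$y'' + \lambda y = f(x) \quad (0 \le x \le l), \quad y(0) = \alpha_1, \quad y(l) = \alpha_2 \tag{59}$$

To begin with we assume $\lambda > 0$. Then the independent solutions of the corresponding homogeneous differential equation are the functions $y_1(x) = \cos\sqrt{\lambda}\,x$, $y_2(x) = \sin\sqrt{\lambda}\,x$, and the determinant of system (56) is

$$\begin{vmatrix} y_1(0) & y_2(0) \\ y_1(l) & y_2(l) \end{vmatrix} = \begin{vmatrix} 1 & 0 \\ \cos\sqrt{\lambda}\,l & \sin\sqrt{\lambda}\,l \end{vmatrix} = \sin\sqrt{\lambda}\,l$$

Equating it to zero, we get the values

$$\lambda = \left(\frac{\pi}{l}\right)^2, \ \left(\frac{2\pi}{l}\right)^2, \ \left(\frac{3\pi}{l}\right)^2, \ \dots \tag{60}$$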

for  which  we  have  the  particular  case  for  problem  (59),  which  means 
that  either  the  existence  or  the  uniqueness  of  solution  is  upset. 

The  set  of  values  of  the  parameter  involved  in  the  statement 
of  a problem,  for  which  values  the  problem  degenerates  in  one  sense 
or  another  (which  is  to  say  that  it  loses  certain  very  essential  pro- 
perties and  assumes  qualitatively  different  properties),  is  called 
the  spectrum  of  the  problem.  We  leave  it  to  the  reader  to  verify  that 
for $\lambda \le 0$ we always have the basic case for problem (59), and thus
the  set  of  values  (60)  constitutes  the  entire  spectrum  of  the  problem. 

The spectrum (60) of problem (59) may also be obtained by a somewhat different but actually equivalent method. As has been pointed out, the particular case for a boundary-value problem is characterized by the fact that the corresponding homogeneous problem

$$y'' + \lambda y = 0, \quad y(0) = 0, \quad y(l) = 0$$

can have a nontrivial solution. The general solution of this differential equation is of the form

$$y = C_1 \cos\sqrt{\lambda}\,x + C_2 \sin\sqrt{\lambda}\,x = C\left(D\cos\sqrt{\lambda}\,x + \sin\sqrt{\lambda}\,x\right)$$

where $C = C_2$, $D = \frac{C_1}{C_2}$. Using the boundary conditions yields

$$C\left(D\cos\sqrt{\lambda}\cdot 0 + \sin\sqrt{\lambda}\cdot 0\right) = 0, \quad C\left(D\cos\sqrt{\lambda}\,l + \sin\sqrt{\lambda}\,l\right) = 0$$

We see that the constant $C$ remains arbitrary, whereas there remain two equations to find the two constants $D$ and $\lambda$:

$$D\cos\sqrt{\lambda}\cdot 0 + \sin\sqrt{\lambda}\cdot 0 = 0, \quad D\cos\sqrt{\lambda}\,l + \sin\sqrt{\lambda}\,l = 0$$

From this we get

$$D = 0, \quad \sin\sqrt{\lambda}\,l = 0, \quad \sqrt{\lambda}\,l = k\pi \quad (k = 1, 2, \dots)$$

and we arrive at the same values (60) for $\lambda$.
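
As a numerical cross-check of the spectrum (60), one can evaluate the determinant of system (56) for problem (59) as a function of $\lambda$. The sketch below is illustrative only: the function name `det_59` and the length $l = 2$ are our assumptions, and for $\lambda < 0$ we use the pair $\cosh$, $\sinh$ as independent solutions (any other independent pair changes the determinant only by a nonzero factor).

```python
import math


def det_59(lam, l):
    """Determinant of system (56) for y'' + lam*y = f(x), y(0) = a1, y(l) = a2."""
    if lam > 0:
        # y1 = cos(sqrt(lam) x), y2 = sin(sqrt(lam) x)
        return math.sin(math.sqrt(lam) * l)
    if lam == 0:
        # y1 = 1, y2 = x
        return l
    # lam < 0: y1 = cosh(sqrt(|lam|) x), y2 = sinh(sqrt(|lam|) x)
    return math.sinh(math.sqrt(-lam) * l)


l = 2.0  # assumed length of the interval
spectrum = [(k * math.pi / l) ** 2 for k in range(1, 6)]  # values (60)

for lam in spectrum:
    print(f"lambda = {lam:9.4f}   determinant = {det_59(lam, l):.1e}")

# For lambda <= 0 the determinant stays away from zero (basic case).
for lam in (0.0, -0.5, -4.0):
    print(f"lambda = {lam:9.4f}   determinant = {det_59(lam, l):.4f}")
```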

The result just obtained has an interesting application to the investigation of the stability of an elastic rod in the case of compression. Let a homogeneous (which means being the same throughout its length) elastic rod be located along the $x$-axis and let it be compressed along this axis by a force $P$ (both ends of the rod are held on the $x$-axis, but can be free to rotate about the points of attachment (Fig. 96a)). Now, when the force attains a certain critical value, $P_{cr}$, the rod bends and takes up the position shown in Fig. 96b. If we denote by $y$ the transverse deviation of a point of the rod from its original position, then, as is demonstrated in strength of materials courses, the function $y(x)$ satisfies, to a sufficient degree of accuracy, the differential equation and boundary conditions

$$y'' + \frac{P}{EJ}\,y = 0, \quad y(0) = y(l) = 0 \tag{61}$$

Fig. 96

Here, $E$ and $J$ are the so-called modulus of elongation of the material of the rod and its moment of inertia, respectively.

As follows from (60), when

$$\frac{P}{EJ} < \left(\frac{\pi}{l}\right)^2 \tag{62}$$

we have the basic case for the problem (61), that is, it has only a trivial solution: no bending occurs. As soon as, with the increase of $P$, the inequality (62) becomes an equality, the particular case sets in and problem (61) has, in addition to the trivial solution, a solution of the form $y = C\sin\frac{\pi x}{l}$, where $C$ is an arbitrary constant. But then the rod cannot be held in the rectilinear state and small external forces* can lead to finite deviations from this state: the rod loses stability. The resulting expression for $P_{cr}$,

$$P_{cr} = \frac{\pi^2 EJ}{l^2}$$

was found by Euler in 1757. It might appear that for $P > P_{cr}$ the rod would straighten out, but this is not so. Equation (61) describes the deviation of the rod only in the limit for small deviations, whereas an analysis of the more exact equation that holds true for arbitrary deviations (it turns out to be nonlinear) shows that as $P$ passes through $P_{cr}$, in addition to the unstable rectilinear form of equilibrium there appears a curved form of equilibrium, which is stable. As $P$ increases, the curvature of this form rises rapidly and the rod is destroyed.

* We have in view external forces which tend to deflect the rod from the rectilinear state, for example, a small force directed perpendicularly to the rod.
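
To get a feeling for the magnitude of the critical force $P_{cr} = \pi^2 EJ / l^2$, here is a small worked computation. All numerical data are assumptions chosen for illustration: a rod 1 m long with a solid circular cross-section 1 cm in diameter, a modulus $E = 2 \cdot 10^{11}$ Pa (a typical order of magnitude for steel), and the standard formula $J = \pi d^4 / 64$ for the moment of inertia of a circular section.

```python
import math


def critical_force(E, J, l):
    """Smallest force at which problem (61) has a nontrivial solution."""
    return math.pi ** 2 * E * J / l ** 2


# Assumed illustrative data (SI units).
E = 2.0e11                 # modulus of elongation, Pa
d = 0.01                   # diameter of the circular cross-section, m
l = 1.0                    # length of the rod, m
J = math.pi * d ** 4 / 64  # moment of inertia of a solid circular section, m^4

P_cr = critical_force(E, J, l)
print(f"J    = {J:.3e} m^4")
print(f"P_cr = {P_cr:.0f} N")

# Doubling the length reduces the critical force four times.
print(f"P_cr for a rod twice as long = {critical_force(E, J, 2 * l):.0f} N")
```

The quadratic dependence on $l$ in the last line follows directly from the form of the spectrum (60).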

Green's function (Sec. 6.2) can be applied to the solution of a nonhomogeneous equation under homogeneous boundary conditions,

$$y'' + p(x)\,y' + q(x)\,y = f(x) \quad (a \le x \le b), \quad y(a) = 0, \quad y(b) = 0 \tag{63}$$

in the basic (nonparticular) case, since it is clear that when the right members are added, so are the solutions. In accordance with Sec. 6.2, if we denote by $G(x, \xi)$ the solution of problem (63), in which we take the delta function $\delta(x - \xi)$ instead of $f(x)$, then for an arbitrary function $f(x)$ the solution of (63) can be obtained from the formula

$$y(x) = \int_a^b f(\xi)\,G(x, \xi)\,d\xi \tag{64}$$

Here is a simple example. Suppose we have the problem

$$y'' = f(x) \quad (0 \le x \le l), \quad y(0) = y(l) = 0 \tag{65}$$

If instead of $f(x)$ we put $\delta(x - \xi)$, then for $0 \le x < \xi$ and for $\xi < x \le l$ we simply get $y'' = 0$ or the solution

$$y = ax + b \quad (0 \le x < \xi), \quad y = cx + d \quad (\xi < x \le l)$$

where $a$, $b$, $c$, $d$ are some constants. Applying the boundary conditions shows that $b = 0$ and $cl + d = 0$, or

$$y = ax \quad (0 \le x < \xi), \quad y = c(x - l) \quad (\xi < x \le l) \tag{66}$$

If the equation $y'' = \delta(x - \xi)$ is integrated from $x = \xi - 0$ to $x = \xi + 0$, then we find that $y'(\xi + 0) - y'(\xi - 0) = 1$. Incidentally, for the left-hand side of equation (63) we would have the same result since integration of a finite function over an interval of length zero yields zero. A second integration of the delta function yields a continuous function so that $y(\xi - 0) = y(\xi + 0)$ and from (66) we get $c - a = 1$, $a\xi = c(\xi - l)$, whence

$$a = -\frac{l - \xi}{l}, \quad c = \frac{\xi}{l}$$

Substituting into (66), we get Green's function for the problem (65):

$$G(x, \xi) = \begin{cases} -\dfrac{(l - \xi)\,x}{l} & (0 \le x \le \xi), \\[2mm] -\dfrac{\xi\,(l - x)}{l} & (\xi \le x \le l) \end{cases}$$

Fig. 97

This function is shown in Fig. 97. It will be seen that it differs from the function constructed in Sec. 6.2 only by the constant factor $-F$. By virtue of formula (64) we get the solution of problem (65) for any function $f(x)$:

$$y = \int_0^l G(x, \xi)\,f(\xi)\,d\xi = \int_0^x G(x, \xi)\,f(\xi)\,d\xi + \int_x^l G(x, \xi)\,f(\xi)\,d\xi = -\frac{l - x}{l}\int_0^x \xi f(\xi)\,d\xi - \frac{x}{l}\int_x^l (l - \xi) f(\xi)\,d\xi$$
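
Formula (64) with the Green's function of problem (65) is easy to test numerically. The following sketch approximates the integral by the midpoint rule and compares the result with two solutions that can be written down by hand: $y = x(x - l)/2$ for $f(x) = 1$ and $y = -(l/\pi)^2 \sin(\pi x / l)$ for $f(x) = \sin(\pi x / l)$. The function names, the number of subintervals and the length $l = 2$ are our assumptions.

```python
import math


def green(x, xi, l):
    """Green's function of y'' = f(x), y(0) = y(l) = 0."""
    if x <= xi:
        return -(l - xi) * x / l
    return -xi * (l - x) / l


def solve(f, x, l, n=2000):
    """Approximate formula (64) on [0, l] by the midpoint rule with n subintervals."""
    h = l / n
    total = 0.0
    for i in range(n):
        xi = (i + 0.5) * h
        total += green(x, xi, l) * f(xi)
    return total * h


l = 2.0
x = 0.5

y_const = solve(lambda s: 1.0, x, l)
y_sine = solve(lambda s: math.sin(math.pi * s / l), x, l)

print(y_const, x * (x - l) / 2)
print(y_sine, -(l / math.pi) ** 2 * math.sin(math.pi * x / l))
```

Both computed values vanish at $x = 0$ and $x = l$ automatically, because $G(0, \xi) = G(l, \xi) = 0$.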


Exercises 

1. Find the spectrum of the boundary-value problem $y'' + \lambda y = 0$, $y(0) = 0$, $y'(l) = 0$.

2. Use Green's function to construct a solution of the problem $y'' + y = f(x)$, $y(0) = 0$, $y(\pi/2) = 0$.

8.9  Boundary  layer 

It  often  happens  that  the  differential  equation  being  studied  or  a 
system  of  such  equations  contains  one  or  several  parameters  that  can 
take  on  a variety  of  constant  values.  For  the  sake  of  simplicity,  let  us 
consider  the  first-order  equation 

$$\frac{dy}{dx} = f(x, y; \lambda) \tag{67}$$

(where $\lambda$ is a parameter) for definite initial conditions $x = x_0$, $y = y_0$. We assume that the point $(x_0, y_0)$ is not singular (Sec. 8.1), that is, for the given conditions there is a unique solution of the equation (67). Then from the geometric meaning of the equation (67) (Sec. 7.1) it follows that if the right side depends on $\lambda$ in continuous fashion, then the direction field will vary but slightly for small changes in $\lambda$, and for this reason the solution $y(x, \lambda)$ will also depend on $\lambda$ in continuous fashion.

Fig. 98

However,  it  sometimes  happens  that  the  parameter  occurs  in 
the  differential  equation  in  a manner  such  that  for  certain  values 
of  the  parameter  the  equation  lowers  its  order  (degenerates).  Then 
new  circumstances  arise  which  we  illustrate  with  an  example. 

Consider  the  problem 

$$\lambda y' + y = 0, \quad y\big|_{x=0} = 1 \tag{68}$$

with the solution $y = e^{-x/\lambda}$. The equation degenerates when $\lambda = 0$ (why?). Suppose the solution is considered for $x \ge 0$ and $\lambda \to +0$; this solution is shown in Fig. 98. The equation (68) passes into $y = 0$ in the limit, but we see that for small $\lambda$ the solution is close to zero not from $x = 0$ at once but only from a certain $x = h$. The interval $0 < x < h$, which is called the boundary layer, serves as a transition from the unit initial value (68) to a value close to zero. The width of the boundary layer is merely conventional, since theoretically the solution never becomes exactly equal to zero. If, say, for the width of the boundary layer we take the value $x = h$ at which the solution has diminished by a factor of $e$ from its original value, then we get

$$e^{-h/\lambda} = \frac{1}{e}, \quad h = \lambda$$

which means that for the problem (68) the width of the boundary layer is simply equal to $\lambda$.
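
The conventional nature of the width is easy to see numerically. The sketch below evaluates the solution $y = e^{-x/\lambda}$ at $x = \lambda$ and also computes the point at which the solution has dropped to 1 % of its initial value; the 1 % threshold and the sample values of $\lambda$ are our own illustrative choices.

```python
import math


def y(x, lam):
    """Solution of lam*y' + y = 0, y(0) = 1, for lam > 0."""
    return math.exp(-x / lam)


for lam in (0.1, 0.01, 0.001):
    h = lam                        # width by the 1/e convention of the text
    x_1pct = lam * math.log(100)   # width by an assumed 1 % convention
    print(f"lambda = {lam:<6}  y(h) = {y(h, lam):.4f}  "
          f"h = {h:<6}  x(1%) = {x_1pct:.5f}")
```

Whatever threshold is chosen, the width remains proportional to $\lambda$; only the numerical factor changes (1 for the $1/e$ convention, $\ln 100 \approx 4.6$ for the 1 % convention).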

This  is  the  profile  that  the  velocity  of  a fluid  has  in  the  sliding 
motion of a lamina in a fluid at rest. Here, $x$ is the distance from the lamina, reckoned along the normal to it; the velocity of the fluid is laid off on the $y$-axis, and the parameter $\lambda$ is proportional to the viscosity of the fluid. It turns out that for the equations of motion of a viscous fluid (this is a system of partial differential equations) the coefficient of viscosity serves as a factor of the highest derivative; in other words, for these equations we have the same situation as
for  the  model  problem  (68).  In  the  case  of  viscosity,  the  fluid  adheres 
to  the  lamina,  the  layer  of  fluid  entrained  by  the  lamina  being  the 
narrower,  the  lower  the  viscosity.  Incidentally,  this  is  clear  from 
physical  reasoning.  In  the  limit,  when  the  viscosity  is  zero  (this  is 
called  an  ideal  fluid),  the  lamina  slides  without  entraining  any  fluid, 
and  the  velocity  of  the  fluid  is  equal  to  zero  right  down  to  the 
very  surface  of  the  lamina. 

If $\lambda \to -0$, then the resulting solution, depicted in Fig. 98 by the dashed line, tends to infinity for any $x > 0$. This case is of less interest.

Exercise 

Consider the behaviour of the solution of the problem $\lambda y'' - y = 1$, $y(-1) = y(1) = 0$ as $\lambda \to +0$.

8.10  Similarity  of  phenomena 

Two  or  several  phenomena  are  said  to  be  similar  (in  physics, 
chemistry,  engineering,  sociology,  etc.)  if  they  differ  in  scale  alone. 
For  example,  Fig.  99  shows  similar  processes  of  current  variation 
in  a circuit:  in  both  cases  the  current  builds  up  from  zero  over  an 
interval  of  time,  then  remains  constant,  after  which  it  drops  suddenly 
to zero. Thus, if for the characteristic time $t_{ch}$ (which in this process is the standard for comparison purposes) we take the time of current buildup, and for the characteristic current $j_{ch}$, the maximum current, and if we reckon the time from the onset of buildup and then denote

$$t = t_{ch}\,\bar{t}, \quad j = j_{ch}\,\bar{j}$$

where $\bar{t}$ and $\bar{j}$ are nondimensional time and current, then the relationship $\bar{j}(\bar{t})$ shown in Fig. 100 will be the same for both processes.

Note  that  the  graphs  in  Fig.  99  are  not  similar  in  a geometric 
sense  since  the  angles  change  when  passing  from  one  graph  to  the  other 
and  the  lengths  are  not  proportional.  But  if  quantities  with  different 
dimensions  are  laid  off  on  the  axes,  we  are  never  interested  in  the 
angles  and  in  the  segments  that  are  not  parallel  to  the  coordinate 
axes. For example, in Fig. 99 the distance of a point from the origin is equal to $\sqrt{t^2 + j^2}$ by the rules of analytic geometry, but this expression is meaningless from the point of view of dimensions and for this reason will never be encountered.

Fig. 100

If the relations $j(t)$ are similar, then so also are other relationships, such, say, as $\int_0^t j(t)\,dt$, which is the dependence of the magnitude of the charge flow on the time, or $Rj^2$, the dependence of the electric power on the time, and so on. But imagine that there is a device in the circuit that is turned on when the current reaches a definite value $j_0$; the operation of such a device is described by the function $e(j - j_0)$, where $e$ is a unit function (Sec. 6.3). It is clear that, generally, there will be no similarity with respect to this device. Hence one speaks of similarity relative to certain characteristics and not in general.

How can we find out if two phenomena are similar or not? If the characteristics relative to which the similarity is considered are obtained from a solution of certain equations, then see if linear transformations can be performed with the quantities in these equations (i.e. substitutions of the type $x \to a_x \bar{x} + b_x$, which signify changes in the scale and the reference point) so that the equations for both cases become the same. Of course if one is dealing with differential equations, then after the transformation the initial conditions (which are also needed for a solution) must likewise coincide. Here is an equivalent procedure: if the equations of two phenomena can be reduced via linear transformations to one and the same “standard” form, then the phenomena are similar.

Here  is  an  example.  Suppose  we  are  considering  forced  oscilla- 
tions of  a linear  oscillator  without  friction  (Sec.  7.5)  defined  by 
the  equation 

$$m\frac{d^2x}{dt^2} + kx = A\sin\omega t \tag{69}$$

and by initial conditions

$$x\big|_{t=0} = x_0, \quad \frac{dx}{dt}\bigg|_{t=0} = 0 \tag{70}$$

There are five parameters in this problem: $m$, $k$, $A$, $\omega$, $x_0$. To determine when the oscillations will be similar for various combinations of these parameters, take $1/\omega$ and $x_0 \ne 0$ for the characteristic time and length and denote $t = \frac{1}{\omega}\bar{t}$, $x = x_0\bar{x}$. After passing to the nondimensional variables $\bar{t}$, $\bar{x}$, and performing simple transformations, we get the differential equation and initial conditions in the form

$$\frac{d^2\bar{x}}{d\bar{t}^2} + \frac{k}{m\omega^2}\,\bar{x} = \frac{A}{x_0 m\omega^2}\sin\bar{t}, \quad \bar{x}\big|_{\bar{t}=0} = 1, \quad \frac{d\bar{x}}{d\bar{t}}\bigg|_{\bar{t}=0} = 0 \tag{71}$$

Thus, if for one oscillator the values

$$I_1 = \frac{k}{m\omega^2}, \quad I_2 = \frac{A}{x_0 m\omega^2} \tag{72}$$

are the same as those of the other, then the standard problems (71) are the same for them, which means that these oscillations are similar. It is easy to verify that the quantities $I_1$ and $I_2$ are nondimensional. (By applying the results of Sec. 7.5 it is easy to obtain the formulas $I_1 = \omega_0^2/\omega^2$, $I_2 = x_{free}/x_0$, where $\omega_0$ is the natural frequency of the oscillator and $x_{free}$ is the amplitude of oscillations of the free mass $m$ under the action of a force $A\sin\omega t$.) Such nondimensional quantities whose coincidence ensures the similarity of phenomena are called similarity criteria. In the problem at hand there are two criteria (72), of which only $I_1$ is determined by the parameters of the oscillator, whereas the coincidence of $I_2$ can be ensured through a choice of the initial conditions. (The case $x_0 = 0$ was not included in the foregoing reasoning. We leave it to the reader to see that in this case the sole similarity criterion will be $I_1$.)

The solution of the standard problem (71) has the form $\bar{x} = \varphi(\bar{t}; I_1, I_2)$, where the specific form of the function $\varphi$ is readily obtainable by the method of Sec. 7.5. Returning to the original variables, we get the following oscillation law:

$$x = x_0\,\varphi\!\left(\omega t;\ \frac{k}{m\omega^2},\ \frac{A}{x_0 m\omega^2}\right)$$
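
The similarity of two oscillators with equal criteria can be illustrated numerically. In the sketch below the two parameter sets are invented so that both give $I_1 = 4$, $I_2 = 2$; equation (69) is integrated in the original dimensional variables for each oscillator (by a scheme with half-integral values of the argument), and the scaled displacements $x/x_0$ are compared at the same nondimensional time $\omega t = 10$. The closed-form function `standard_solution` was derived by hand for the standard problem (71) with $I_1 > 0$, $I_1 \ne 1$ and should be treated as an illustrative assumption rather than as the formula of Sec. 7.5.

```python
import math


def criteria(m, k, A, omega, x0):
    """Similarity criteria (72)."""
    return k / (m * omega ** 2), A / (x0 * m * omega ** 2)


def standard_solution(tau, I1, I2):
    """Hand-derived solution of (71), assuming I1 > 0 and I1 != 1."""
    w = math.sqrt(I1)
    c = I2 / (I1 - 1.0)
    return math.cos(w * tau) + c * (math.sin(tau) - math.sin(w * tau) / w)


def simulate(m, k, A, omega, x0, tau_end=10.0, n=10000):
    """Integrate (69), (70) in dimensional variables; return x/x0 at omega*t = tau_end."""
    dt = tau_end / omega / n
    x = x0
    v = (A * math.sin(0.0) - k * x0) / m * dt / 2   # half step from v(0) = 0
    for i in range(1, n + 1):
        x += v * dt
        t = i * dt
        v += (A * math.sin(omega * t) - k * x) / m * dt
    return x / x0


# Two invented oscillators with the same criteria I1 = 4, I2 = 2.
osc_a = dict(m=1.0, k=4.0, A=2.0, omega=1.0, x0=1.0)
osc_b = dict(m=2.0, k=72.0, A=54.0, omega=3.0, x0=1.5)

print(criteria(**osc_a), criteria(**osc_b))
print(simulate(**osc_a), simulate(**osc_b), standard_solution(10.0, 4.0, 2.0))
```

Although the masses, stiffnesses, forces and frequencies all differ, the scaled displacements coincide, as the equality of the criteria requires.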

In many cases the similarity criterion can be established directly from the dimensions. In the problem discussed above, we take for the basic dimensions the mass $M$, the time $T$ and the length $L$. Then the dimensions of the parameters of the problem will be $[m] = M$, $[k] = MT^{-2}$, $[A] = MT^{-2}L$, $[\omega] = T^{-1}$, $[x_0] = L$ (verify this!). Using these parameters, we can form only one dimensionless combination not containing the initial data. This is $I_1$. (Of course, the quantities $I_1^2$, $\sqrt{I_1}$ and so on will also be nondimensional, but they do not yield any new and independent similarity criteria.) Another nondimensional combination that contains the initial data is $I_2$. It is easy to see (we leave this to the reader) that any nondimensional combination of the type $m^a k^b A^c \omega^d x_0^e$ can be represented in the following manner: $I_1^p I_2^q$, that is, in this problem there are no similarity criteria independent of $I_1$ and $I_2$.
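
The verification left to the reader can be sketched as follows. The exponent names $a, b, c, d, e$ and $p, q$ are our own notation; the dimensions are those listed in the text.

```latex
% Dimension of a general product of the five parameters
[m^a k^b A^c \omega^d x_0^e] = M^{a+b+c}\; T^{-2b-2c-d}\; L^{c+e}

% The product is nondimensional when all three exponents vanish:
a + b + c = 0, \qquad 2b + 2c + d = 0, \qquad c + e = 0

% Three equations in five unknowns leave two free exponents.
% Taking b = p and c = q as the free ones gives
a = -p - q, \qquad d = -2p - 2q, \qquad e = -q

% and therefore
m^a k^b A^c \omega^d x_0^e
  = \left(\frac{k}{m\omega^2}\right)^{p} \left(\frac{A}{x_0 m \omega^2}\right)^{q}
  = I_1^{\,p}\, I_2^{\,q}
```

Two free exponents correspond to exactly two independent criteria, in agreement with the count of five parameters and three basic dimensions.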

Let us consider another example. Suppose a ball of mass $m$ is suspended by a string of length $l$ and is oscillating with frequency $\nu$ and maximum angle $\alpha$ of deviation from the vertical; air resistance, the mass of the string, and other complicating factors are ignored. It is clear that there is one relation between the indicated parameters and the acceleration of gravity $g$: for example, we can arbitrarily specify $m$, $l$, $g$ and $\alpha$, and then $\nu$ is determined uniquely. The dimensions of the parameters are: $[m] = M$, $[l] = L$, $[\nu] = T^{-1}$, $[\alpha] = 1$, $[g] = LT^{-2}$. Here the complete system of nondimensional criteria is $S = l\nu^2 g^{-1}$ and $\alpha$; and since there must be a relationship connecting them, $S$ must depend on $\alpha$ alone, whence

$$S^{1/2} = \nu\sqrt{\frac{l}{g}} = f(\alpha), \quad \text{that is,} \quad \nu = \sqrt{\frac{g}{l}}\,f(\alpha) \tag{73}$$

The function $f(\alpha)$ may be obtained either numerically, by integrating the appropriate differential equation, or experimentally, by measuring $\nu$ for specified values of the remaining parameters. Since $S$ is expressed in terms of $\alpha$, it follows that $\alpha$ in this problem is the sole criterion of similarity.

We thus see that simple dimensional reasoning enabled us to obtain important information concerning oscillations. These are of course only the simplest examples of the application of the theory of similarity and dimensionality, which is today widely used in various branches of physics. The theory and its applications are described very well in the following books: [2], [5], [14]. Beginning with a certain level, the mathematical theory is closely tied up with physics and is being developed in various spheres of physics in different ways.

Simulation (model-building) is based on the similarity of phenomena. Here the behaviour of an object is studied on a model similar to the original in the sense described above. For example, to find out what the frequency will be of a pendulum on the moon for a given $\alpha$, we can measure the frequency for that same $\alpha$ under terrestrial conditions, and then recalculate by the formula

$$\nu_{moon} = \nu_{earth}\sqrt{\frac{g_{moon}\,l_{earth}}{g_{earth}\,l_{moon}}}$$

that follows from the relation (73).
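
Both uses of relation (73) can be sketched in code: obtaining $f(\alpha)$ numerically and recalculating a frequency from terrestrial to lunar conditions. The sketch assumes the standard pendulum equation, which in the nondimensional time $\tau = t\sqrt{g/l}$ reads $\theta'' = -\sin\theta$ with $\theta(0) = \alpha$, $\theta'(0) = 0$; the function names, the step `dtau`, the pendulum length of 1 m and the rounded values of $g$ are our assumptions.

```python
import math


def f(alpha, dtau=1e-3):
    """f(alpha) = nu*sqrt(l/g), found by integrating theta'' = -sin(theta).

    The angle is advanced at integral steps and the angular velocity at
    half-integral steps until theta first crosses zero (a quarter period).
    Intended for 0 < alpha < pi.
    """
    theta, tau = alpha, 0.0
    omega = -math.sin(alpha) * dtau / 2        # first half step for the velocity
    while True:
        new_theta = theta + omega * dtau
        if new_theta <= 0.0:
            # linear interpolation for the instant of zero crossing
            tau += dtau * theta / (theta - new_theta)
            break
        theta = new_theta
        tau += dtau
        omega -= math.sin(theta) * dtau
    return 1.0 / (4.0 * tau)                   # nondimensional frequency


def frequency(l, g, alpha):
    """Relation (73)."""
    return math.sqrt(g / l) * f(alpha)


G_EARTH, G_MOON = 9.81, 1.62   # m/s^2, assumed rounded values

alpha = 0.5                    # maximum angle in radians
nu_earth = frequency(1.0, G_EARTH, alpha)
nu_moon = nu_earth * math.sqrt(G_MOON / G_EARTH)   # same pendulum, same alpha

print(f"f(0.01) = {f(0.01):.5f},  1/(2*pi) = {1 / (2 * math.pi):.5f}")
print(f"nu on the earth = {nu_earth:.4f} 1/s,  on the moon = {nu_moon:.4f} 1/s")
```

For small $\alpha$ the computed $f(\alpha)$ approaches $1/(2\pi)$, the familiar small-oscillation result, and it decreases as $\alpha$ grows.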

ANSWERS  AND  SOLUTIONS 


Sec.  8.1 

Saddle  point,  nodal  point,  vortex  point. 


Sec.  8.2 

$yy' + zz' = y^2 + z^2$, $\frac{d(y^2 + z^2)}{y^2 + z^2} = 2\,dx$, whence $y^2 + z^2 = Ce^{2x}$. From this we see that for $x \to \infty$ all the particular solutions (except the trivial solution) become infinite, whereas they tend to zero as $x \to -\infty$.


Sec.  8.3 

1. If $a \ne 6$, the system has exactly one solution. For $a = 6$, $b \ne 9$, the system is inconsistent. For $a = 6$, $b = 9$ the system has an

infinity  of  solutions:  x = t,  y = - — - (for  arbitrary  /). 


2. $p_1 = 1$, $\lambda_1 = 1$, $\mu_1 = 3$, $p_2 = 6$, $\lambda_2 = 1$, $\mu_2 = -2$, $x = C_1 e^{t} + C_2 e^{6t}$, $y = 3C_1 e^{t} - 2C_2 e^{6t}$.


Sec.  8.4 

At the point (0, 0) the equilibrium is stable; at the points (1, −1) and (−1, 1) it is unstable.


Sec.  8.5 


1. $y_0(x) = 1$, $y_1(x) = 1 + x$, $y_2(x) = 1 + x + \frac{x^2}{2}$, $y_3(x) = 1 + x + \frac{x^2}{2} + \frac{x^3}{6}$, ... In the limit we obtain the exact solution $y = e^x$ expanded in a series of powers of $x$.


2.  y — x + — + — + ... 

O O 


.3 . y = 


x2(6  - 8x  + 3x2) 


« + 


1 - * 


12(1  - x)2 
Sec. 8.7

1. The following is a tabulation of the computations via the first approximation, the second approximation, and the exact values:

x      y_first   y_second   y_exact
0.0    1.0000    1.0000     1.0000
0.1    1.1000    1.1051     1.1052
0.2    1.2100    1.2212     1.2214
0.3    1.3310    1.3496     1.3499
0.4    1.4641    1.4915     1.4918
0.5    1.6105    1.6483     1.6487
0.6    1.7716    1.8216     1.8221
0.7    1.9488    2.0131     2.0138
0.8    2.1437    2.2248     2.2255
0.9    2.3581    2.4587     2.4596
1.0    2.5939    2.7172     2.7183
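
A table of this kind can be generated with a few lines of code. The sketch below assumes that the "scheme involving recalculation" is the usual predictor-corrector variant (a first-approximation step followed by a recalculation with the average of the slopes at both ends of the interval); this identification and the function names are our assumptions. Because the code does not round intermediate results to four places, its last digits may differ slightly from the hand-computed column (for instance, it gives about 2.5937 rather than 2.5939 at $x = 1$ for the first approximation).

```python
import math


def euler(f, x0, y0, h, n):
    """First approximation: y_{k+1} = y_k + h * f(x_k, y_k)."""
    x, y = x0, y0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y


def recalculation(f, x0, y0, h, n):
    """Assumed recalculation scheme: predict by Euler, correct with the mean slope."""
    x, y = x0, y0
    for _ in range(n):
        slope0 = f(x, y)
        y_pred = y + h * slope0
        slope1 = f(x + h, y_pred)
        y += h * (slope0 + slope1) / 2
        x += h
    return y


rhs = lambda x, y: y   # the equation y' = y of Exercise 1

for k in range(11):
    x = 0.1 * k
    y_first = euler(rhs, 0.0, 1.0, 0.1, k)             # step 0.1
    y_second = recalculation(rhs, 0.0, 1.0, 0.05, 2 * k)  # step 0.05
    print(f"{x:.1f}  {y_first:.4f}  {y_second:.4f}  {math.exp(x):.4f}")
```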

2.  By  the  first  method,  y(0.5)  = 0.7081 ; by  the  second,  0.7161. 

3. The differential equation has the form $\frac{dv}{dt} = 0.5\,t(20 - t) - 0.2\,v$. The velocity 1.5 seconds after the start of motion is 9.682 cm/s.

4. The computed values and exact values are:

x          0.0    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0
y_approx   0.000  0.100  0.199  0.296  0.390  0.480  0.565  0.645  0.718  0.784  0.842
y_exact    0.000  0.100  0.199  0.296  0.389  0.479  0.565  0.644  0.717  0.783  0.842

5. The following is a table of the values of the solution:

t    0.0    0.1    0.2    0.3    0.4    0.5
x    0.000  0.100  0.201  0.305  0.412  0.525

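Sec. 8.8

1. Let $\lambda > 0$. Then $y_1(x) = \cos\sqrt{\lambda}\,x$, $y_2(x) = \sin\sqrt{\lambda}\,x$ and the determinant of the system similar to (56) is

$$\begin{vmatrix} y_1(0) & y_2(0) \\ y_1'(l) & y_2'(l) \end{vmatrix} = \begin{vmatrix} 1 & 0 \\ -\sqrt{\lambda}\sin\sqrt{\lambda}\,l & \sqrt{\lambda}\cos\sqrt{\lambda}\,l \end{vmatrix} = \sqrt{\lambda}\cos\sqrt{\lambda}\,l$$

Whence the spectrum is determined from the equation $\cos\sqrt{\lambda}\,l = 0$, that is, $\sqrt{\lambda}\,l = \frac{\pi}{2} + k\pi$, $\lambda = \frac{(2k + 1)^2\pi^2}{4l^2}$ $(k = 0, 1, 2, \dots)$.

If $\lambda < 0$, then $y_1(x) = e^{\sqrt{|\lambda|}\,x}$, $y_2(x) = e^{-\sqrt{|\lambda|}\,x}$; the determinant is equal to

$$\begin{vmatrix} 1 & 1 \\ \sqrt{|\lambda|}\,e^{\sqrt{|\lambda|}\,l} & -\sqrt{|\lambda|}\,e^{-\sqrt{|\lambda|}\,l} \end{vmatrix} = -\sqrt{|\lambda|}\left(e^{\sqrt{|\lambda|}\,l} + e^{-\sqrt{|\lambda|}\,l}\right) < 0$$

that is, it does not vanish. For $\lambda = 0$ we have $y_1 = 1$, $y_2 = x$; the determinant is equal to

$$\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1$$

2. In this problem,

$$G(x, \xi) = \begin{cases} -\cos\xi\,\sin x & (0 \le x \le \xi), \\ -\sin\xi\,\cos x & \left(\xi \le x \le \frac{\pi}{2}\right) \end{cases}$$

whence the solution of the problem is of the form

$$y = -\cos x\int_0^x \sin\xi\,f(\xi)\,d\xi - \sin x\int_x^{\pi/2}\cos\xi\,f(\xi)\,d\xi$$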

Sec.  8.9 

The  solution  has  the  form 


en  + c tr 

As $\lambda \to +0$ it tends to the solution of the degenerate equation,
that is, to $y = -1$ for all $x$ between $-1$ and $1$. Near both
endpoints there arises a boundary layer whose width is asymptotically
equal to $\sqrt{\lambda}$ as $\lambda \to 0$.


Chapter  9 
VECTORS 


In physics we often have to do with vectors, which are quantities
endowed with numerical value and also direction. Examples are: a line
segment connecting the origin of coordinates with a given point, the
velocity of a moving material point, the force acting on a body.

If  a body  is  in  motion  along  a definite 
line, say a straight-line railway track, the position
of the body can be determined by the
distance  from  a specified  point  on  the  line 
measured  along  this  line.  Motion  on  such  a line 
is  possible  only  in  two  directions,  which  can  be 
distinguished  by  affixing  a plus  sign  to  one  direction  and  a minus 
sign  to  the  other. 

If  a body  is  known  to  be  moving  in  a plane  (or  in  space), 
then  we  cannot  indicate  the  position  of  the  body  at  any  time  if 
only  its  distance  is  given  from  a specified  point;  we  also  have  to 
specify  the  direction  of  the  line  connecting  the  body  with  that  point 
(coordinate  origin).  In  the  same  way,  when  specifying  velocity  we 
have to indicate the magnitude and the direction. Quantities endowed
with direction are termed vectors. We will indicate them by boldface
type or by an arrow over the letter. In contrast to vectors, quantities
that do not have direction and are completely determined by
their numerical value in a chosen system of units are called scalars.
Examples  are:  the  mass  of  a body,  its  energy,  the  temperature  of 
a body  at  a given  point.  The  term  “scalar”  was  not  needed  as  long 
as  we  got  along  without  the  word  “vector”. 

Vectors  can  be  treated  in  three-dimensional  space  or  on  a plane 
(in  two-dimensional  space). 

If  we  disregard  the  direction  of  a vector  quantity,  we  can 
deal  with  the  absolute  value  (modulus)  of  the  vector.  The  modulus 
is  a positive  scalar  having  the  dimensions  of  the  given  quantity. 
For  example,  for  a vector  F of  5 newtons  force  having  a definite 
direction,  the  modulus  (denoted  by  | F | or  F)  is  5 newtons ; | F | = 
5 newtons. 

Two vectors are taken to be equal if they have the same moduli, are
parallel and in the same direction. This means that every vector can,
without alteration, be translated parallel to itself to any spot,
which means the origin of such a vector can have any location.
To  specify  a vector  means  to  specify  its  modulus  and  direction. 

Geometrically,  a vector  is  depicted  by  a line  segment  whose 
direction  is  indicated  by  an  arrow.  This  of  course  requires  a scale 
(for  instance,  a force  of  1 newton  can  be  represented  by  a line 
segment  of  3 cm  length,  and  so  forth).  Only  a vector  quantity  having 
the  dimensions  of  length  can  be  represented  without  this  condition, 
that is to say, on a 1-to-1 scale. For this reason, we can say, for
example, that a vector of translation laid off from point A in space
will reach point B, whereas for a vector of force such an assertion
is  meaningless. 

9.1  Linear  operations  on  vectors 

Linear operations involving vectors include addition (and the associated
subtraction) and multiplication of the vector by a scalar. (Adding
a vector  and  a scalar  is  just  as  absurd  as  trying  to  add  seconds  and 
centimetres.)  A quantity  is  characterized  as  a vector  only  if  these 
operations  are  performed  in  accord  with  the  rules  given  below. 

Addition  of  two  vectors  obeys  the  familiar  parallelogram  rule 
of  school  physics  (see  Fig.  101).  To  find  the  sum  here  it  is  sufficient 
to  construct  only  one  of  the  two  triangles  shown  in  Fig.  101.  From 
this  it  is  easy  to  obtain  a rule  for  the  addition  of  several  vectors: 
in  Fig.  102  we  have 

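$$\overrightarrow{OB} = \mathbf{a} + \mathbf{b}, \quad \overrightarrow{OC} = \overrightarrow{OB} + \mathbf{c} = \mathbf{a} + \mathbf{b} + \mathbf{c},$$

$$\overrightarrow{OD} = \overrightarrow{OC} + \mathbf{d} = \mathbf{a} + \mathbf{b} + \mathbf{c} + \mathbf{d}$$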

(In the three-dimensional case the vectors a, b, c, d do not necessarily
need to lie in a single plane.) Thus the sum of a number of
vectors  is  represented  by  the  line  segment  that  closes  the  polygonal 
line  whose  segments  are  the  vector  summands;  the  direction  of  the 
closing  vector  is  from  the  beginning  of  the  first  summand  of  the 
vector  to  the  end  of  the  last  summand. 

If, referring to Fig. 102, we reverse the direction of the vector
$\overrightarrow{OD}$, the conclusion we come to is particularly interesting. If the
vectors form a closed polygon (each vector is applied to the end
of the preceding one and the end of the last one coincides with
the origin of the first), then the sum of all these vectors is equal
to the zero (null) vector 0, which is a vector whose terminus coincides
with its origin. The modulus of a null vector is zero and the
direction  is  undefined. 
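
The polygon rule can be checked numerically. The following is only a minimal sketch: it anticipates the Cartesian projections introduced later in this section and represents each vector as a tuple of three numbers. The helper name `add_vectors` and the sample values are illustrative assumptions, not part of the text.

```python
def add_vectors(*vectors):
    """Sum vectors given as (x, y, z) tuples, component by component."""
    return tuple(sum(components) for components in zip(*vectors))


# Four sample vectors laid end to end (they need not lie in one plane).
a = (1.0, 2.0, 0.0)
b = (3.0, -1.0, 2.0)
c = (-2.0, 0.5, 1.0)
d = (0.0, 1.0, -4.0)

# The closing vector runs from the beginning of a to the end of d.
closing = add_vectors(a, b, c, d)

# Reversing the closing vector closes the polygon, so the total is the null vector.
reversed_closing = tuple(-x for x in closing)
total = add_vectors(a, b, c, d, reversed_closing)

print(closing)
print(total)
```

Reversing the closing vector turns the broken line into a closed polygon, and the printed total has all components equal to zero.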

Note  that  vectors  cannot  be  connected  by  an  inequality  sign; 
in  particular,  there  are  no  positive  and  negative  vectors. 

Multiplication of a vector by a scalar (say, by an ordinary
number) is a natural generalization of addition and subtraction of
vectors. The vector 3F stands for the sum F + F + F; from the
construction it is clear that this vector has the direction of F but
is three times as long. The vector (-1)F = -F is understood to
be a vector which, when added to F, yields 0. It is clear that
this vector is in the reverse direction to that of F and has the
same modulus as F. Generalizing these definitions, we say that $\lambda$F
(where $\lambda$ is any scalar) is a vector whose modulus is equal to
$|\lambda|\,|\mathbf{F}|$, and $\lambda$F is parallel to F and in the same direction if $\lambda > 0$
and in the opposite direction if $\lambda < 0$. (For $\lambda < 0$ we say that the
vectors $\lambda$F and F are antiparallel.)

Fig. 101

Fig. 102

Linear  operations  involving  vectors  abide  by  all  the  ordinary 
rules  of  elementary  mathematics.  For  instance,  a term  a can  be 
transposed  to  the  other  side  of  an  equation  to  become  — a,  both 
sides  of  a vector  equation  can  be  multiplied  or  divided  by  one  and 
the  same  scalar  by  the  ordinary  rules,  etc. 

Any expression of the form $\lambda\mathbf{a} + \mu\mathbf{b} + \dots + \delta\mathbf{d}$, where $\lambda, \mu, \dots, \delta$
are scalars, is called a linear combination of the vectors a, b, ..., d.
The vectors thus specified are said to be linearly dependent if any
one of them is a linear combination of the others; otherwise these
vectors are said to be linearly independent (among themselves).

The linear dependence of two vectors means that they are parallel
(think this over!). If any two nonparallel vectors a and b have been
chosen in a plane, then any third vector c in that plane can be
“resolved into the vectors a and b”, which is to say, it can be
represented in the form of a linear combination (Fig. 103):

$$\mathbf{c} = \lambda\mathbf{a} + \mu\mathbf{b} \qquad (1)$$

It is therefore possible, in a plane, to indicate two linearly independent
vectors, but any three vectors are then linearly dependent. The expansion
(1) is frequently encountered in mechanics and other fields (the
resolution of a force along two directions, and the like), each one of
the terms $\lambda\mathbf{a}$ and $\mu\mathbf{b}$ being called a component of the vector c, and the
two vectors a, b being termed the basis.
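
As an illustration of the expansion (1), the coefficients can be found by solving two linear equations for the two unknowns. The sketch below assumes the vectors are already given by pairs of numbers in some fixed reference frame and uses Cramer's rule; the function name `resolve_in_basis` is a hypothetical helper, not notation from the text.

```python
def resolve_in_basis(c, a, b):
    """Return (lam, mu) such that c = lam*a + mu*b for plane vectors.

    Each vector is a pair of numbers. The basis vectors a and b must not
    be parallel, otherwise the determinant below vanishes.
    """
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("a and b are parallel and do not form a basis")
    lam = (c[0] * b[1] - c[1] * b[0]) / det
    mu = (a[0] * c[1] - a[1] * c[0]) / det
    return lam, mu


a = (2.0, 0.0)
b = (1.0, 1.0)
c = (4.0, 3.0)

lam, mu = resolve_in_basis(c, a, b)
print(lam, mu)

# Rebuild c from its two components lam*a and mu*b as a check.
rebuilt = (lam * a[0] + mu * b[0], lam * a[1] + mu * b[1])
print(rebuilt)
```

The error raised for parallel a and b corresponds to the remark that two linearly dependent vectors cannot serve as a basis in the plane.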

In the same way, in space we can indicate three linearly independent
vectors (any three vectors not parallel to a single plane).
They can be taken as a basis, which means that any fourth vector can
be resolved into these three, and so any four vectors in space are
linearly dependent. The difference lies in the fact that a plane is
two-dimensional and ordinary space is three-dimensional. If we introduce
the concept of a vector into n-dimensional space (Sec. 4.8), then
the basis there will consist of n vectors (see Sec. 9.6).

The  most  widely  used  bases  are  those  consisting  of  unit  vectors 
(i.e.  with  nondimensional  modulus  equal  to  1)  that  are  mutually 
perpendicular.  Such  bases  are  termed  Cartesian  or  Euclidean.  Vectors 
used  to  form  a Cartesian  basis  are  ordinarily  denoted  by  i,  j (in  the 
plane)  and  i,  j,  k (in  space).  Thus,  by  analogy  with  (1),  we  can  write 
the  resolution  of  any  vector: 

$$\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j}$$

in the plane and

$$\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k}$$

in space. Here, $a_x$, $a_y$, $a_z$ are the coefficients of the expansion.

Fig. 104

It is easy to determine the geometric significance of these coefficients,
which are called the Cartesian coefficients of the vector a; for the
sake of definiteness we will speak of vectors in the plane, since the
results are quite analogous for space. In the plane choose a point O,
called the origin of coordinates, and through it draw axes parallel
to the vectors of the chosen basis. Denote these axes by the letters
x and y (Fig. 104). We thus obtain a Cartesian system of coordinates
in which the position of each point A in the plane is defined by the
coordinates $(x_A, y_A)$. From Fig. 104 we see that for the vector $\mathbf{a} = \overrightarrow{AB}$
we have

$$a_x = x_B - x_A, \quad a_y = y_B - y_A$$

The difference $x_B - x_A$ is also called the projection of the vector
$\overrightarrow{AB}$ on the x-axis. Generally, the projection $\mathrm{pr}_l\,\mathbf{a}$ of vector $\mathbf{a} = \overrightarrow{AB}$ on
some axis l (that is, on a straight line on which is indicated which
of the two directions is taken to be positive) is the modulus
(absolute value) of the vector $\overrightarrow{A'B'}$ defined by the feet of the
perpendiculars (Fig. 105) dropped from points A and B to the axis l; this
modulus is taken with the plus sign or the minus sign depending on
whether the vector $\overrightarrow{A'B'}$ goes along the positive or negative direction
of the l-axis. The projection of one vector on another is defined in
similar fashion; in this case, perpendiculars are dropped to the other
vector or to its prolongation.

Thus, the Cartesian coordinates of a vector are its projections on
the basis vectors:

$$a_x = \mathrm{pr}_x\mathbf{a} = \mathrm{pr}_{\mathbf{i}}\mathbf{a}, \quad a_y = \mathrm{pr}_y\mathbf{a} = \mathrm{pr}_{\mathbf{j}}\mathbf{a}, \quad a_z = \mathrm{pr}_z\mathbf{a} = \mathrm{pr}_{\mathbf{k}}\mathbf{a}$$

We  stress  the  fact  that  the  projection  of  a vector  is  a scalar. 
Its  physical  dimensions  are  those  of  the  vector  being  projected.  From 
Fig.  106  there  follows  immediately  a simple  formula  for  computing 
projections: 


$$\mathrm{pr}_l\,\mathbf{a} = |\mathbf{a}|\cos\alpha = |\mathbf{a}|\cos(\widehat{\mathbf{a}, l}) \qquad (2)$$

From this follows, in particular, the widely used formula for the
Cartesian resolution of the unit vector e:

$$\mathbf{e} = e_x\mathbf{i} + e_y\mathbf{j} + e_z\mathbf{k} = \cos(\widehat{\mathbf{e}, x})\,\mathbf{i} + \cos(\widehat{\mathbf{e}, y})\,\mathbf{j} + \cos(\widehat{\mathbf{e}, z})\,\mathbf{k}$$

The use of Cartesian projections makes possible the use of formulas
and computations instead of geometric constructions, and ordinarily
this turns out to be simpler. By way of an illustration, we obtain the
condition of the parallelism of two vectors specified by their
resolutions:

$$\mathbf{F} = F_x\mathbf{i} + F_y\mathbf{j} + F_z\mathbf{k}, \quad \mathbf{G} = G_x\mathbf{i} + G_y\mathbf{j} + G_z\mathbf{k}$$

This condition can be written as a vector equation:

$$\mathbf{G} = \lambda\mathbf{F}$$

where $\lambda$ is a scalar, or in terms of the projections onto the coordinate
axes:

$$G_x = \lambda F_x, \quad G_y = \lambda F_y, \quad G_z = \lambda F_z$$

Solving for $\lambda$ in each equation and then equating the results, we get
the required condition:

$$\frac{G_x}{F_x} = \frac{G_y}{F_y} = \frac{G_z}{F_z}$$

This condition consists of two equations, but since a vector is determined
by three quantities, then in choosing a vector parallel to the given one,
there still remains one degree of freedom, the modulus. (Derive these
equations geometrically on the basis of the properties of similarity
of triangles.)
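
The parallelism condition lends itself to a short computational check. The sketch below is an illustration under one stated assumption: the equal ratios are compared in cross-multiplied form (for instance $G_xF_y = G_yF_x$), which is equivalent when no projection of F vanishes and avoids division by zero otherwise. The function name `are_parallel` and the tolerance are hypothetical choices.

```python
def are_parallel(f, g, tol=1e-12):
    """Check G = lam*F in cross-multiplied form, avoiding division by zero.

    f and g are (x, y, z) tuples of Cartesian projections. Antiparallel
    vectors (lam < 0) also pass the check, as does a null vector.
    """
    fx, fy, fz = f
    gx, gy, gz = g
    return (abs(gx * fy - gy * fx) <= tol
            and abs(gy * fz - gz * fy) <= tol
            and abs(gz * fx - gx * fz) <= tol)


F = (1.0, 2.0, 3.0)
print(are_parallel(F, (-2.0, -4.0, -6.0)))  # G = -2F, antiparallel
print(are_parallel(F, (1.0, 2.0, 4.0)))     # z-projection breaks the ratio
```

Only two of the three cross-multiplied equations are independent when F is nonzero, in agreement with the remark that the condition consists of two equations.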

To  conclude,  let  it  be  emphasized  that  the  vector  a is  completely 
described  by  its  Cartesian  projections  ax , ay,  az  only  when  the  Cartesian 
basis  i,  j,  k is  fixed.  If  this  basis  is  chosen  in  a different  manner,  that 
is,  if  the  coordinate  axes  are  rotated,  then  the  same  vector  will  have 
other  projections  (coordinates).  We  will  discuss  this  in  more  detail 
in  Sec.  9.5. 

Exercises 

1.  Let  OA  = a,  OB  = b.  Find  the  vector  OC,  where  C is  the  midpoint 
of  the  line  segment  AB. 

2. Under the same conditions as in Exercise 1, find the vector $\overrightarrow{OD}$,
where D divides AB in the ratio $\lambda : 1$.

3. A vector has origin at the point A(2, 3) and terminus at the point
B(-1, 4). Resolve this vector in terms of the unit vectors of the
coordinate axes.

9.2  The  scalar  product  of  vectors 

The  scalar  product  (also  sometimes  called  dot  product  or  inner 
product)  of  two  vectors  is  a scalar  equal  to  the  product  of  the  modulus 
of  one  vector  by  the  projection  onto  it  of  the  other  vector.  The  scalar 
product of the vectors a and b is denoted by $\mathbf{a}\cdot\mathbf{b}$ or $(\mathbf{a}, \mathbf{b})$. Thus,
$\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}|\,\mathrm{pr}_{\mathbf{a}}\mathbf{b}$ and if we take advantage of formula (2) for computing
the projection, we get

$$\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}|\,\mathrm{pr}_{\mathbf{a}}\mathbf{b} = |\mathbf{a}|\,|\mathbf{b}|\cos(\widehat{\mathbf{a}, \mathbf{b}}) \qquad (3)$$

From this it is immediately apparent that a scalar product does
not depend on the order of the factors: $\mathbf{a}\cdot\mathbf{b} = \mathbf{b}\cdot\mathbf{a}$. Besides, it is
evident that for nonzero vectors a and b the scalar product $\mathbf{a}\cdot\mathbf{b}$ is
positive, negative or zero according as the angle between the vectors
a and b is acute, obtuse, or a right angle. A particularly important
equation to remember is

$$\mathbf{a}\cdot\mathbf{b} = 0$$

as the necessary and sufficient condition for two vectors a and b to
be perpendicular. A special case is the scalar product of a vector into
itself (the scalar square of the vector):

$$\mathbf{a}\cdot\mathbf{a} = |\mathbf{a}|\,|\mathbf{a}|\cos 0 = |\mathbf{a}|^2$$

An  important  example  of  a scalar  product  is  the  work  performed 
by  a force  F over  a rectilinear  path  S.  If  the  body  is  moving  in  a 
straight  line  and  the  force  is  directed  along  this  line,  the  work  is  equal 
to  the  product  of  the  force  by  the  path  length.  If  the  direction  of  the 
force  does  not  coincide  with  the  direction  of  motion,  then  the  work 
is  equal  to  the  product  of  the  path  into  the  component  of  the  force 
acting  in  the  direction  of  motion,  that  is, 

$$A = |\mathbf{S}|\,F_S = |\mathbf{S}|\,|\mathbf{F}|\cos\theta = \mathbf{S}\cdot\mathbf{F}$$

where $\theta$ is the angle between the directions of force and motion. Thus,
the work is equal to the scalar product of the force and the path vector S.
The special case of a force acting in the direction of motion is embraced
by this formula. Here, if the direction of the force coincides with that
of the motion, then $\theta = 0$, $\cos\theta = 1$, $A = |\mathbf{F}|\,|\mathbf{S}|$ and the work is
positive. But if the force acts in the reverse direction to the motion,
then $\theta = \pi$, $\cos\theta = -1$, $A = -|\mathbf{F}|\,|\mathbf{S}|$, and the work is negative.

Also  note  the  following  properties  of  a scalar  product: 

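$$(\lambda\mathbf{a})\cdot\mathbf{b} = \lambda(\mathbf{a}\cdot\mathbf{b}), \quad (\mathbf{a} + \mathbf{b})\cdot\mathbf{c} = \mathbf{a}\cdot\mathbf{c} + \mathbf{b}\cdot\mathbf{c}$$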

We  will  not  dwell  on  the  simple  proof  of  these  properties,  which  stems 
directly  from  the  definition  of  a scalar  product.  The  reader  can  carry 
out  the  proof  himself.  These  properties  permit  using  the  ordinary 
rules  of  algebra  when  dealing  with  scalar  products,  but  one  should 
remember  that  only  two  vectors  can  be  multiplied  together  to  form  a 
scalar  product. 

Use  is  also  made  of  the  following  formulas,  which  the  reader  can 
verify  for  himself:  if  e,  e',  e"  are  unit  vectors,  then 

$$\mathbf{a}\cdot\mathbf{e} = \mathrm{pr}_{\mathbf{e}}\mathbf{a}, \quad \mathbf{e}'\cdot\mathbf{e}'' = \cos(\widehat{\mathbf{e}', \mathbf{e}''})$$

Let us now derive an important formula that permits computing
a scalar product if we know the projections of the vectors in
some Cartesian basis i, j, k (Sec. 9.1):

$$\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k}, \quad \mathbf{b} = b_x\mathbf{i} + b_y\mathbf{j} + b_z\mathbf{k} \qquad (4)$$

If one takes note of the fact that

$$\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1, \quad \mathbf{i}\cdot\mathbf{j} = \mathbf{j}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{i} = 0$$

(why is this?), then, substituting for a and b their expansions (4), we
get

$$\mathbf{a}\cdot\mathbf{b} = a_xb_x + a_yb_y + a_zb_z \qquad (5)$$

(verify this!). As was pointed out at the end of Sec. 9.1, the projections
of two vectors a and b on the coordinate axes generally change
under a rotation of the axes; however, the right side of (5) remains
unchanged (invariant), since it is equal to the left side, and the definition
of the scalar product was given irrespective of the location of the axes.
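
The invariance of the right side of (5) can be observed numerically. The sketch below assumes one particular change of axes, a rotation of the x- and y-axes about the z-axis through an angle phi, and compares the sum of products of projections before and after. The function names and sample values are illustrative assumptions.

```python
import math


def dot(a, b):
    """Scalar product by formula (5): sum of products of projections."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def rotate_axes_about_z(v, phi):
    """Projections of the same vector v on axes turned by phi about z."""
    x, y, z = v
    return (x * math.cos(phi) + y * math.sin(phi),
            -x * math.sin(phi) + y * math.cos(phi),
            z)


a = (1.0, 2.0, 3.0)
b = (-4.0, 0.5, 2.0)
phi = 0.7  # rotation angle in radians

a_new = rotate_axes_about_z(a, phi)
b_new = rotate_axes_about_z(b, phi)

print(a_new, b_new)                  # the individual projections change
print(dot(a, b), dot(a_new, b_new))  # the scalar product does not
```

The individual projections differ in the two systems of axes, while the two printed scalar products agree up to rounding error.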

If  in  formula  (5)  we  set  b = a,  then  we  have  an  expression  for 
the  square  of  the  modulus  of  the  vector  in  terms  of  its  Cartesian 
coordinates : 

$$\mathbf{a}\cdot\mathbf{a} = |\mathbf{a}|^2 = a_x^2 + a_y^2 + a_z^2$$

For a vector in a plane we get $|\mathbf{a}|^2 = a_x^2 + a_y^2$. This formula is
equivalent to the Pythagorean theorem, and the preceding one is
equivalent to an analogue of the Pythagorean theorem for space (the square
of  a diagonal  of  a rectangular  parallelepiped  is  equal  to  the  sum  of 
the  squares  of  three  edges). 
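
Formulas (3) and (5) together give a practical way to compute moduli, angles and work from projections. The following sketch is illustrative only: the force and path values, their units, and the helper names `modulus` and `angle_between` are assumptions chosen for the example.

```python
import math


def dot(a, b):
    """Scalar product by formula (5)."""
    return sum(x * y for x, y in zip(a, b))


def modulus(a):
    """Modulus from the scalar square: |a|^2 = a . a."""
    return math.sqrt(dot(a, a))


def angle_between(a, b):
    """Angle between nonzero vectors, obtained by solving formula (3)."""
    return math.acos(dot(a, b) / (modulus(a) * modulus(b)))


force = (3.0, 4.0, 0.0)  # sample force, in newtons
path = (2.0, 0.0, 0.0)   # sample rectilinear path, in metres

work = dot(force, path)
theta = angle_between(force, path)
work_via_angle = modulus(force) * modulus(path) * math.cos(theta)

print(work, work_via_angle)
print(math.degrees(theta))
```

Both ways of computing the work agree: only the component of the force along the path contributes.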

Exercises 

1. Two vectors of unit length form an angle $\varphi = 30°$. Find their
scalar  product. 

2.  Find  the  scalar  product  of  the  vectors  depicted  in  Fig.  107. 

3. Compute the angle $\theta$ between the vectors

$$\mathbf{F} = \sqrt{3}\,\mathbf{i} + \mathbf{j} \quad \text{and} \quad \mathbf{G} = -\sqrt{3}\,\mathbf{i} + \mathbf{j}$$

4. Prove that vectors having their origin at the point A(-1, 1)
and their termini at the points B(1, 2) and C(0, -1), respectively,
are perpendicular.

5. A parallelogram ABCD is constructed on the vectors $\overrightarrow{AB} = \mathbf{F}$
and $\overrightarrow{AD} = \mathbf{G}$. Express the diagonals $\overrightarrow{AC}$ and $\overrightarrow{DB}$ in terms of F
and G, consider $\overrightarrow{AC}\cdot\overrightarrow{AC}$ and $\overrightarrow{DB}\cdot\overrightarrow{DB}$, and then prove the theorem:
the sum of the squares of the diagonals of the parallelogram
is equal to the sum of the squares of all its sides.

6. In a cube of side a find the length of the (inner!) diagonals; the
angles between the diagonals; the projections of the sides on the
diagonals.

7. A regular tetrahedron with side a is located so that one of its
vertices lies at the coordinate origin, another on the positive
x-axis, and a third in the first quadrant of the xy-plane. Find the
coordinates of all vertices and of the centre of the tetrahedron
and also the angle between the straight lines issuing from the
centre to the vertices.

Fig. 107


9.3  The  derivative  of  a vector 

Let  us  find  the  derivative  of  a vector  that  is  dependent  on  some 
variable,  say  the  time  t , with  respect  to  this  variable. 

Lay off the vector A(t) from some point O. Then the terminus
of the vector will, as t varies, trace out a line (L) (Fig. 108).

Take a very small dt and form the ratio

$$\frac{\mathbf{A}(t + dt) - \mathbf{A}(t)}{dt}$$

This ratio is also a vector. (The vector $\mathbf{A}(t + dt) - \mathbf{A}(t)$ is shown in
Fig. 108 by the line segment $M_1M_2$.) It is the derivative of the vector
A(t) with respect to the variable t and for this reason is denoted
by $\frac{d\mathbf{A}}{dt}$, so that

$$\frac{d\mathbf{A}}{dt} = \frac{\mathbf{A}(t + dt) - \mathbf{A}(t)}{dt} \qquad (6)$$

(To be more exact, we should put the limit sign in the right member
as $dt \to 0$.)

Fig. 108

Formula  (6)  can  also  be  rewritten  thus: 

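$$\mathbf{A}(t + dt) = \mathbf{A}(t) + \frac{d\mathbf{A}}{dt}\,dt \qquad (7)$$

to within infinitesimals of higher order. As in the case of ordinary
functions, we can write the Taylor series

$$\mathbf{A}(t) = \mathbf{A}(t_0) + \mathbf{B}(t_0)(t - t_0) + \mathbf{D}(t_0)\frac{(t - t_0)^2}{2} + \dots$$

where $\mathbf{B}(t_0) = \left.\frac{d\mathbf{A}}{dt}\right|_{t=t_0}$, $\mathbf{D}(t_0) = \left.\frac{d^2\mathbf{A}}{dt^2}\right|_{t=t_0}$ and so forth.

It is clear that for very small dt the points $M_2$ and $M_1$ of (L)
are very close together, so that the vector $\frac{d\mathbf{A}}{dt}$ is directed along the
tangent to it.

If C is a constant vector, then $\mathbf{C}(t + dt) - \mathbf{C}(t) = 0$, so that in
this case $\frac{d\mathbf{C}}{dt} = 0$.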
dt 

Using  the  definition  of  a derivative,  it  is  easy  to  prove  the  follow- 
ing two  formulas: 

1. $\frac{d}{dt}[a_1\mathbf{A}_1(t) + a_2\mathbf{A}_2(t)] = a_1\frac{d\mathbf{A}_1}{dt} + a_2\frac{d\mathbf{A}_2}{dt}$, where $a_1$, $a_2$ are
constant scalars.

2. $\frac{d}{dt}[f(t)\mathbf{A}(t)] = \frac{df}{dt}\mathbf{A}(t) + f(t)\frac{d\mathbf{A}}{dt}$. In particular, $\frac{d}{dt}[f(t)\mathbf{C}] = \frac{df}{dt}\mathbf{C}$
if the vector C is a constant. Thus we see that the derivative of a vector
of type f(t)C is parallel to the vector itself (whereas it is evident from
Fig. 108 that this is not so in the general case).

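Let us find the derivative of a scalar product. Suppose A and B
are two variable vectors. By the definition of a derivative,

$$\frac{d}{dt}(\mathbf{A}\cdot\mathbf{B}) = \frac{\mathbf{A}(t + dt)\cdot\mathbf{B}(t + dt) - \mathbf{A}(t)\cdot\mathbf{B}(t)}{dt}$$

and by (7)

$$\mathbf{A}(t + dt)\cdot\mathbf{B}(t + dt) = \mathbf{A}\cdot\mathbf{B} + \left(\mathbf{A}\cdot\frac{d\mathbf{B}}{dt} + \frac{d\mathbf{A}}{dt}\cdot\mathbf{B}\right)dt + \frac{d\mathbf{A}}{dt}\cdot\frac{d\mathbf{B}}{dt}\,dt^2$$

Neglecting the term containing $dt^2$, we finally get

$$\frac{d}{dt}(\mathbf{A}\cdot\mathbf{B}) = \mathbf{A}\cdot\frac{d\mathbf{B}}{dt} + \frac{d\mathbf{A}}{dt}\cdot\mathbf{B} \qquad (8)$$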


Thus,  the  formula  has  the  same  aspect  as  the  formula  for  the  deri- 
vative of  a product  of  two  scalar  functions. 

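In particular, putting A = B in (8), we get

$$\frac{d}{dt}(\mathbf{A}\cdot\mathbf{A}) = \mathbf{A}\cdot\frac{d\mathbf{A}}{dt} + \frac{d\mathbf{A}}{dt}\cdot\mathbf{A} = 2\mathbf{A}\cdot\frac{d\mathbf{A}}{dt} \qquad (9)$$

But $\mathbf{A}\cdot\mathbf{A} = |\mathbf{A}|^2$ and so $\frac{d}{dt}(\mathbf{A}\cdot\mathbf{A}) = 2|\mathbf{A}|\frac{d|\mathbf{A}|}{dt}$. Equating these
results, we obtain

$$|\mathbf{A}|\frac{d|\mathbf{A}|}{dt} = \mathbf{A}\cdot\frac{d\mathbf{A}}{dt}, \quad \text{that is,} \quad \frac{d|\mathbf{A}|}{dt} = \frac{\mathbf{A}\cdot\frac{d\mathbf{A}}{dt}}{|\mathbf{A}|} \qquad (10)$$

From this it is easy to see, in particular, that if the vector A(t) has a
constant modulus and only the direction changes, then the vector
$\frac{d\mathbf{A}}{dt}$ is perpendicular to the vector A. Indeed, then $|\mathbf{A}|$ = constant,
or $\frac{d|\mathbf{A}|}{dt} = 0$, and by virtue of this last formula, $\mathbf{A}\cdot\frac{d\mathbf{A}}{dt} = 0$, and the
fact that the scalar product is equal to zero means that A and $\frac{d\mathbf{A}}{dt}$ are
perpendicular.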
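
Both conclusions can be tried out numerically by replacing the derivative with a difference quotient for a small step, as in formula (6). The sketch below is an illustration only: the two sample vector functions, the step size, and the helper names are assumptions, and a symmetric (central) difference is used in place of the one-sided ratio for better accuracy.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def modulus(a):
    return math.sqrt(dot(a, a))


def derivative(vec_func, t, h=1e-6):
    """Central-difference estimate of dA/dt for a vector function A(t)."""
    forward = vec_func(t + h)
    backward = vec_func(t - h)
    return tuple((f - b) / (2 * h) for f, b in zip(forward, backward))


def on_sphere(t):
    """A vector of constant modulus 2; only its direction changes."""
    return (2 * math.sin(t) * math.cos(3 * t),
            2 * math.sin(t) * math.sin(3 * t),
            2 * math.cos(t))


def growing(t):
    """A vector whose modulus varies with t."""
    return (t, t * t, 1.0)


t = 0.8
h = 1e-6

# Constant modulus: A and dA/dt should be perpendicular.
perp_check = dot(on_sphere(t), derivative(on_sphere, t))

# Varying modulus: compare d|A|/dt with (A . dA/dt) / |A|.
lhs = (modulus(growing(t + h)) - modulus(growing(t - h))) / (2 * h)
rhs = dot(growing(t), derivative(growing, t)) / modulus(growing(t))

print(perp_check)
print(lhs, rhs)
```

The first printed number is zero up to rounding, and the two numbers on the second line agree to several decimal places.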
Exercise

Find the angle that the helical curve $x = R\cos\omega t$, $y = R\sin\omega t$,
$z = vt$ forms with its axis (z-axis).


9.4  The  motion  of  a material  point 

When a material point (particle) is in motion in space, the relationships
between the position of the point, its velocity and its acceleration
are relationships between vectors. Suppose the position of the point
is described by the vector r drawn from the origin to this point, the
velocity by the vector u, and the acceleration by the vector a. Then
from the definition of a derivative we immediately have

$$\mathbf{u} = \frac{d\mathbf{r}}{dt}, \quad \mathbf{a} = \frac{d\mathbf{u}}{dt} = \frac{d^2\mathbf{r}}{dt^2}$$

Let us write down the vector r in terms of the unit vectors of the
coordinate axes:

$$\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$$

Here,  x,  y,  z are  the  projections,  which  vary  with  time  as  the  point 
moves,  and  the  vectors  i,  j,  k are  constants. 

And  so 


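$$\mathbf{u} = \frac{d\mathbf{r}}{dt} = \frac{d}{dt}(x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) = \frac{dx}{dt}\mathbf{i} + \frac{dy}{dt}\mathbf{j} + \frac{dz}{dt}\mathbf{k}$$

Hence, the projections of the velocity vector u on the coordinate
axes are equal to the derivatives of the corresponding projections of
the vector r:

$$u_x = \frac{dx}{dt}, \quad u_y = \frac{dy}{dt}, \quad u_z = \frac{dz}{dt}$$

In the same way, for the acceleration we get

$$a_x = \frac{du_x}{dt} = \frac{d^2x}{dt^2}, \quad a_y = \frac{du_y}{dt} = \frac{d^2y}{dt^2}, \quad a_z = \frac{du_z}{dt} = \frac{d^2z}{dt^2}$$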


By virtue of (10), the rate of change of the distance $|\mathbf{r}|$ of the
point from the origin is

$$\frac{d|\mathbf{r}|}{dt} = \frac{\mathbf{r}\cdot\frac{d\mathbf{r}}{dt}}{|\mathbf{r}|} = \frac{\mathbf{r}\cdot\mathbf{u}}{|\mathbf{r}|}$$

Using this formula it is easy to find, for example, the condition
under which the distance of the point from the origin remains unaltered,
i.e. $|\mathbf{r}|$ = constant. This is precisely what happens in the following
two cases. Firstly, if u = 0 (the point is motionless here), and, secondly,
if at each instant of time the velocity u is perpendicular to the vector
r (in which case the body is in motion over a sphere of radius $|\mathbf{r}|$
with centre at the origin).

Similarly, from (9) we get

$$\frac{d}{dt}|\mathbf{u}|^2 = 2(\mathbf{a}\cdot\mathbf{u}) \qquad (11)$$

From (11) it follows that the modulus of the velocity is constant in two
cases: if the acceleration is zero or if the acceleration is not zero but
is perpendicular to the velocity.

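The equation of motion of a material point has the form $m\mathbf{a} = \mathbf{F}$,
where m is the mass of the point and F is the force acting on the point
(Newton's second law). Multiply both sides of this equation by the
velocity vector u to get a scalar product:

$$m(\mathbf{a}\cdot\mathbf{u}) = \mathbf{F}\cdot\mathbf{u}$$

Using formula (11), we get

$$\frac{m}{2}\frac{d}{dt}|\mathbf{u}|^2 = \mathbf{F}\cdot\mathbf{u}$$

or

$$\frac{d}{dt}\left(\frac{m|\mathbf{u}|^2}{2}\right) = \mathbf{F}\cdot\mathbf{u} \qquad (12)$$

We will show that the quantity $\mathbf{F}\cdot\mathbf{u}$ is the power of the force F.
Indeed, during the time dt the point moves by the amount dr and in
doing so the force performs work $\mathbf{F}\cdot d\mathbf{r}$. The ratio of this work to the
time dt, equal to

$$\frac{\mathbf{F}\cdot d\mathbf{r}}{dt} = \mathbf{F}\cdot\frac{d\mathbf{r}}{dt} = \mathbf{F}\cdot\mathbf{u}$$

is the work performed in unit time, which is the power.

If we set $\frac{m}{2}|\mathbf{u}|^2 = T$, equation (12) takes the form $\frac{dT}{dt} = \mathbf{F}\cdot\mathbf{u}$,
whence $T_2 - T_1 = \int_{t_1}^{t_2}\mathbf{F}\cdot\mathbf{u}\,dt$ (on the right we have the work done
by the force F).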

Thus,  from  Newton's  second  law  it  follows  that  there  is  a definite 
quantity  T expressed  in  terms  of  the  mass  and  velocity  of  the  moving 
point  and  it  is  such  that  the  increase  in  T in  the  process  of  motion 
is  exactly  equal  to  the  work  of  the  external  force  F.  This  is  what 
justifies  calling  T the  kinetic  energy  of  a moving  material  point. 
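
The relation between the increase in T and the work of the force can be illustrated on a made-up example. The sketch below assumes a point of mass 2 starting from rest under the force F(t) = (1, 2t, 0), for which the velocity is known in closed form, and evaluates the work integral by the trapezoidal rule. All names, values and the choice of integration rule are assumptions for illustration.

```python
MASS = 2.0


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def force(t):
    """Sample force acting on the point."""
    return (1.0, 2.0 * t, 0.0)


def velocity(t):
    """Velocity from m*a = F with u(0) = 0: u = (t/2, t^2/2, 0)."""
    return (0.5 * t, 0.5 * t * t, 0.0)


def kinetic_energy(t):
    u = velocity(t)
    return 0.5 * MASS * dot(u, u)


def work(t1, t2, steps=1000):
    """Trapezoidal estimate of the integral of the power F . u from t1 to t2."""
    h = (t2 - t1) / steps
    total = 0.5 * (dot(force(t1), velocity(t1)) + dot(force(t2), velocity(t2)))
    for i in range(1, steps):
        t = t1 + i * h
        total += dot(force(t), velocity(t))
    return total * h


t1, t2 = 1.0, 2.0
energy_gain = kinetic_energy(t2) - kinetic_energy(t1)
work_done = work(t1, t2)
print(energy_gain, work_done)
```

The two printed numbers agree to within the error of the numerical integration.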

Sometimes,  for  the  parameter  defining  the  position  of  a point 
moving  along  a path  one  takes  the  length  s of  the  arc  reckoned  along 
the path from some reference point, i.e., we set r = r(s). Since the
ratio of the length of an infinitesimal arc to the chord subtending the
arc tends to unity (by virtue of the fact that the arc hardly has time to
change its direction, or to “bend”), it follows that, as $\Delta s \to 0$,

$$\left|\frac{d\mathbf{r}}{ds}\right| = \lim\left|\frac{\Delta\mathbf{r}}{\Delta s}\right| = \lim\frac{|\Delta\mathbf{r}|}{\Delta s} = 1 \qquad (13)$$

Therefore the derivative $d\mathbf{r}/ds$ is a unit vector directed along the
tangent to the trajectory. This vector is customarily denoted by $\boldsymbol{\tau}$,
whence

$$\mathbf{u} = \frac{d\mathbf{r}}{dt} = \frac{d\mathbf{r}}{ds}\frac{ds}{dt} = \boldsymbol{\tau}u = u\boldsymbol{\tau} \qquad (14)$$

where $u = |\mathbf{u}|$.

Also from (13) follows an expression for the differential of the arc
in Cartesian coordinates:

$$ds = |d\mathbf{r}| = |d(x\mathbf{i} + y\mathbf{j} + z\mathbf{k})| = |dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}| = \sqrt{dx^2 + dy^2 + dz^2}$$

Since $|\boldsymbol{\tau}(s)| = 1$ = constant, it follows (see Sec. 9.3) that $\frac{d\boldsymbol{\tau}}{ds} \perp \boldsymbol{\tau}$.
Thus the straight line pp (Fig. 109) drawn through the running
point M of the trajectory (L) parallel to $d\boldsymbol{\tau}/ds$ will serve as the normal
to (L), that is, as a perpendicular to the tangent line ll at the point
M. To distinguish it from the other normals (at every point of the
line in space it is possible to draw an infinity of normals that will fill
the entire “normal plane”), it is called the principal normal to the
curve (L) at the point M. The length of the vector $d\boldsymbol{\tau}/ds$ is called the
curvature of the curve (L) at the point M and is denoted by k;
that is,

$$\left|\frac{d\boldsymbol{\tau}}{ds}\right| = k, \quad \frac{d\boldsymbol{\tau}}{ds} = k\mathbf{n}$$

where n is the unit vector of the principal normal (see Fig. 109).

The geometrical meaning of the curvature is seen in Fig. 110:

$$k = \left|\frac{d\boldsymbol{\tau}}{ds}\right| = \lim\frac{|\Delta\boldsymbol{\tau}|}{\Delta s} = \lim\frac{BC}{\Delta s} = \lim\frac{\alpha}{\Delta s}$$

where $\alpha$ is the angle between the tangents to (L) at close points B and
C; in the last passage to the limit we replaced the line segment BC
by the arc of a circle of unit radius. Thus, the curvature is the rate
of rotation of a tangent per unit length of the path traversed. From
this it is evident, for example, that at all points of a circle of radius R
the curvature $k = 1/R$ (this is discussed in HM, Sec. 4.5, where the
curvature of a plane curve is analyzed). It is also apparent here that
the vector $\boldsymbol{\tau}' = \frac{d\boldsymbol{\tau}}{ds}$ and, with it, n as well are in the direction of the
bending curve.

Fig. 110

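Differentiating (14) with respect to t, we get

$$\mathbf{a} = \frac{d^2\mathbf{r}}{dt^2} = \frac{d\mathbf{u}}{dt} = \frac{du}{dt}\boldsymbol{\tau} + u\frac{d\boldsymbol{\tau}}{dt} = \frac{du}{dt}\boldsymbol{\tau} + u\frac{d\boldsymbol{\tau}}{ds}\frac{ds}{dt} = \frac{du}{dt}\boldsymbol{\tau} + u^2k\,\mathbf{n} \qquad (15)$$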

This  formula  is  widely  used  in  physics,  since  it  gives  a resolution  of  the 
acceleration  vector  into  a “tangential”  component  (i.e.  one  along  the 
tangent)  and  a normal  component ; the  latter,  as  we  see,  is  directed 
along  the  principal  normal. 
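
The resolution of the acceleration into tangential and normal parts can be carried out numerically once the vectors u and a are known. The sketch below is an illustration under stated assumptions: the sample motion is along a circle of radius R with polar angle equal to t squared (so the speed is not constant), and `decompose_acceleration` is a hypothetical helper that returns the tangential component, the normal part, and the curvature obtained from the size of the normal part.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def modulus(a):
    return math.sqrt(dot(a, a))


def decompose_acceleration(u, a):
    """Split a into (du/dt) * tau plus a normal part, following (15).

    Returns (du_dt, normal_part, curvature), where curvature is the
    modulus of the normal part divided by the square of the speed.
    """
    speed = modulus(u)
    tau = tuple(x / speed for x in u)   # unit tangent vector
    du_dt = dot(a, tau)                 # tangential component
    normal_part = tuple(ai - du_dt * ti for ai, ti in zip(a, tau))
    curvature = modulus(normal_part) / speed ** 2
    return du_dt, normal_part, curvature


# Sample motion: r = (R cos(t^2), R sin(t^2), 0), differentiated by hand.
R = 2.0
t = 1.3
angle = t * t
u = (-R * math.sin(angle) * 2 * t, R * math.cos(angle) * 2 * t, 0.0)
a = (-R * math.cos(angle) * 4 * t * t - R * math.sin(angle) * 2,
     -R * math.sin(angle) * 4 * t * t + R * math.cos(angle) * 2,
     0.0)

du_dt, normal_part, k = decompose_acceleration(u, a)
print(du_dt)             # rate of change of the speed u = 2*R*t
print(k, 1 / R)          # curvature of a circle of radius R
print(dot(normal_part, u))
```

For this motion the speed is 2Rt, so the tangential component equals 2R, and the curvature comes out as 1/R, as stated above for a circle.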

Thus, since the vector a is laid off from point M, it must necessarily
lie in the plane passing through the tangent line and the principal
normal drawn at this point. This plane is termed the osculating plane to the
curve (L) at the point M. It can be proved that this is nothing but a
plane passing through three points of the curve (L) that are located
infinitely close to M. (Just like a tangent is a straight line drawn
through two infinitely close points of a curve.)
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From  formula  (15)  there  follow  very  important  conclusions  on 
the  forces  acting  on  a point  in  the  course  of  motion.  Put  the  expression 
for  a taken  from  (15)  into  the  formula  of  Newton’s  second  law : F = m&. 
We  see  that  the  acting  force  has  a tangential  component 

Ft  = .Ftt  = m — x (16) 

dt 


and  a normal  component 

Fw  = Fn  n = mu2k  n ( 1 7) 


directed  along  the  principal  normal  to  the  trajectory.  The  osculating 
plane  to  the  trajectory  at  some  point  is  the  plane  of  the  vectors  u 
and  F at  this  point. 

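From formula (16) we get

$$u F_\tau = m u\frac{du}{dt} = \frac{d}{dt}\left(\frac{m u^2}{2}\right) = \frac{dT}{dt}$$

whence the increment in the kinetic energy is equal to

$$T_2 - T_1 = \int_{t_1}^{t_2} u F_\tau\,dt = \int_{s_1}^{s_2} F_\tau\,ds$$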


We see that the work is performed only by the tangential component
of the force. The normal component (which is also called the centripetal
force) does not alter the speed of the point but generates a
curving of the path, the curvature, by (17), being equal to

$$k = \frac{F_n}{m u^2}$$

Recall that the dimensions of curvature (cm$^{-1}$) are inverse to the
dimensions of length.
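The resolution (15) and the curvature formula following from (17) can be checked numerically. The sketch below is only an illustration: the function name `decompose_acceleration` and the sample trajectory (a helix with hand-computed derivatives) are assumptions, not part of the text.

```python
import math

def decompose_acceleration(velocity, acceleration):
    """Split the acceleration into tangential and normal parts, as in formula (15)."""
    speed = math.sqrt(sum(c * c for c in velocity))
    tau = [c / speed for c in velocity]                       # unit tangent vector
    a_tau = sum(a * t for a, t in zip(acceleration, tau))     # tangential part, du/dt
    normal_part = [a - a_tau * t for a, t in zip(acceleration, tau)]
    a_n = math.sqrt(sum(c * c for c in normal_part))          # normal part, u^2 * k
    curvature = a_n / speed ** 2                              # k = F_n / (m u^2), mass cancels
    return a_tau, a_n, curvature

# Sample trajectory (assumed for illustration): r(t) = (R cos wt, R sin wt, v t).
R, w, v, t = 2.0, 3.0, 4.0, 0.7
velocity = [-R * w * math.sin(w * t), R * w * math.cos(w * t), v]
acceleration = [-R * w * w * math.cos(w * t), -R * w * w * math.sin(w * t), 0.0]

a_tau, a_n, k = decompose_acceleration(velocity, acceleration)
print(a_tau, a_n, k)
```

For this motion the speed is constant, so the tangential part vanishes (up to rounding) and the whole acceleration is centripetal.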


Exercise 


Find  the  curvature  of  the  helical  curve  given  in  the  Exercise  to 
Sec.  9.3. 


9.5  Basic  facts  about  tensors 

We  have  already  mentioned,  at  the  end  of  Sec.  9.1,  that  if  we 
describe  a vector  by  the  triad  of  its  Cartesian  projections,  then  we 
have  to  bear  in  mind  that  this  triad  is  essentially  dependent  on  the 
choice  of  the  Cartesian  basis.  Let  us  now  investigate  this  in  more  detail. 
We will denote the vectors of the basis by $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ and the projections
of the vector $\mathbf{a}$ by the symbols $a_1, a_2, a_3$, so that

$$\mathbf{a} = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + a_3\mathbf{e}_3 \tag{18}$$

Now suppose we have chosen a new Cartesian basis $\mathbf{e}'_1, \mathbf{e}'_2, \mathbf{e}'_3$ that can
be expressed in terms of the old basis by the formulas

$$\mathbf{e}'_i = \alpha_{i1}\mathbf{e}_1 + \alpha_{i2}\mathbf{e}_2 + \alpha_{i3}\mathbf{e}_3 = \sum_{j=1}^{3}\alpha_{ij}\mathbf{e}_j \qquad (i = 1, 2, 3)$$

In tensor algebra this is customarily written compactly as follows:

$$\mathbf{e}'_i = \alpha_{ij}\mathbf{e}_j \tag{19}$$

with the convention that the summation is over the repeated index in
accord with the dimensionality of the space; for three-dimensional
space, we have j = 1, 2, 3. This is a dummy index, which means it can
be denoted arbitrarily: $\alpha_{ij}\mathbf{e}_j = \alpha_{ik}\mathbf{e}_k = \alpha_{il}\mathbf{e}_l$ and so on.

Forming the scalar product of both sides of (19) by $\mathbf{e}_j$, we get
$\alpha_{ij} = \mathbf{e}'_i\cdot\mathbf{e}_j$. Similarly, from the formulas $\mathbf{e}_i = \beta_{ij}\mathbf{e}'_j$, which express the
old basis in terms of the new basis, we find that $\beta_{ij} = \mathbf{e}_i\cdot\mathbf{e}'_j$. But
this means that $\beta_{ij} = \alpha_{ji}$; the old basis is expressed in terms of the
new basis by the formulas

$$\mathbf{e}_i = \alpha_{ji}\mathbf{e}'_j \tag{20}$$

(This equation can be rewritten as $\mathbf{e}_j = \alpha_{ij}\mathbf{e}'_i$, so that in the equation
(19) connecting the Cartesian bases we can simply "transpose" the
factor $\alpha_{ij}$ from one side to the other.)

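Substituting the expressions (20) into (18) and denoting by $a'_i$
the projections of the vector $\mathbf{a}$ in the basis $\mathbf{e}'_i$, we get

$$\mathbf{a} = \sum_i a_i\mathbf{e}_i = \sum_i a_i\sum_j\alpha_{ji}\mathbf{e}'_j = \sum_{i,j}\alpha_{ji}a_i\mathbf{e}'_j$$

Changing the notations $i \leftrightarrow j$, we get

$$\mathbf{a} = \sum_{i,j}\alpha_{ij}a_j\mathbf{e}'_i = \sum_i\Big(\sum_j\alpha_{ij}a_j\Big)\mathbf{e}'_i = \sum_i a'_i\mathbf{e}'_i$$

whence

$$a'_i = \alpha_{ij}a_j \tag{21}$$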

Comparing  this  with  (19),  we  see  that  the  projections  of  any  vector 
transform  by  the  same  formulas  as  the  basis  vectors. 
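A small numerical sketch of (19) and (21) follows. The particular new basis (the old one rotated through 30° about the third axis) and all variable names are assumptions chosen for illustration.

```python
import math

def dot(x, y):
    return sum(p * q for p, q in zip(x, y))

phi = math.radians(30)
c, s = math.cos(phi), math.sin(phi)

old_basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# New Cartesian basis (assumed): the old one turned about the third axis.
new_basis = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

# alpha[i][j] is the scalar product of new basis vector i with old basis vector j.
alpha = [[dot(new_basis[i], old_basis[j]) for j in range(3)] for i in range(3)]

a = [1.0, 2.0, 3.0]  # projections of a in the old basis
# Formula (21): new projections, summing over the repeated index j.
a_new = [sum(alpha[i][j] * a[j] for j in range(3)) for i in range(3)]

# The vector itself is unchanged: summing a_new[i] * new_basis[i] restores it.
rebuilt = [sum(a_new[i] * new_basis[i][k] for i in range(3)) for k in range(3)]
print(a_new, rebuilt)
```

The projections change with the basis, while the vector rebuilt from them coincides with the original one.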

In the foregoing, a need not be regarded as a "geometric" vector,
that is, a directed line segment in a plane or in space. It may be a
force vector or a velocity vector, and so forth; in all cases its
projections transform via the formulas (21). It is also easy to verify that,
conversely, any triad of quantities can be interpreted as a triple of
coordinates of some vector, i.e. they can be regarded as defining a
vector, only if they acquire specific values after indicating a Cartesian
basis and if they transform via the formulas (21) under the change
of basis (19). (On the other hand, it is hardly of any use to interpret
just any triad of quantities as a vector. For example, in the study
of a gas flux we can regard the triple of temperature $\theta$, pressure $p$
and density $\rho$ as characterizing the state of the gas at a point of
space. But it is not wise to interpret this triad as a vector, since
rotations of the basis in space do not affect the values of the quantities
that make up the triad. Three scalar quantities do not constitute a
vector.)

There  are  quantities  that  transform  via  a more  complicated  law 
than  (21)  under  a change  of  basis  (19).  An  important  instance  of  such 
quantities  is  obtained  when  considering  a linear  mapping  of  space  into 
itself. We say that there is a linear mapping T if with every vector a there is
associated a vector Ta (of the same space), and to the sum of the inverse
images corresponds the sum of the images, or T(a + b) = Ta + Tb,
T(λa) = λTa. Instances of linear mappings are: a rotation of space
about a point, the uniform contraction of space to a plane, a straight
line, or a point, and so forth (think this through!).

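To describe such a mapping by numbers, choose a basis $\mathbf{e}_i$ in
space and expand each of the vectors $T\mathbf{e}_i$ in terms of this basis:

$$T\mathbf{e}_i = p_{ji}\mathbf{e}_j \tag{22}$$

(note the order of the indices). A knowledge of the nine coefficients
$p_{ij}$ permits finding the image $\mathbf{b} = T\mathbf{a}$ of any vector (18). Indeed,

$$\mathbf{b} = \sum_i b_i\mathbf{e}_i = T\Big(\sum_i a_i\mathbf{e}_i\Big) = \sum_{i,j} a_i p_{ji}\mathbf{e}_j = \sum_{i,j} p_{ij}a_j\mathbf{e}_i$$

whence $b_i = p_{ij}a_j$.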

The set of coefficients $p_{ij}$ depending on two indices is frequently
written in the form of a matrix:

$$\begin{pmatrix} p_{11} & p_{12} & p_{13} \\ p_{21} & p_{22} & p_{23} \\ p_{31} & p_{32} & p_{33} \end{pmatrix}$$

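Now let us see how the coefficients $p_{ij}$ change under a transition
to a new basis $\mathbf{e}'_i$. Since by virtue of (22) $p_{ij} = T\mathbf{e}_j\cdot\mathbf{e}_i$, we get, taking
into account formula (19),

$$p'_{ij} = T\mathbf{e}'_j\cdot\mathbf{e}'_i = T(\alpha_{jl}\mathbf{e}_l)\cdot\alpha_{ik}\mathbf{e}_k = \alpha_{ik}\alpha_{jl}p_{kl} \tag{23}$$

From this it can be shown that the sum $p_{ii}$ remains invariant
under such a change, which means it is a scalar. Indeed,

$$p'_{ii} = \alpha_{ik}\alpha_{il}p_{kl} = (\mathbf{e}'_i\cdot\mathbf{e}_k)\,\alpha_{il}p_{kl} = \mathbf{e}_k\cdot(\alpha_{il}\mathbf{e}'_i)\,p_{kl} = (\mathbf{e}_k\cdot\mathbf{e}_l)\,p_{kl} = p_{kk} = p_{ii}$$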
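The transformation law (23) and the invariance of the sum of the diagonal coefficients can be tried out on numbers. In the sketch below the matrix `p`, the vector `a`, the rotation angle and the helper names are all assumptions made for illustration.

```python
import math

def rotation_about_z(phi):
    # Rows are the new basis vectors written in the old basis.
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_rank1(alpha, a):
    # Formula (21).
    return [sum(alpha[i][j] * a[j] for j in range(3)) for i in range(3)]

def transform_rank2(alpha, p):
    # Formula (23): a double sum over k and l for every pair (i, j).
    return [[sum(alpha[i][k] * alpha[j][l] * p[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

def apply_mapping(p, a):
    # b_i = p_ij a_j
    return [sum(p[i][j] * a[j] for j in range(3)) for i in range(3)]

def trace(p):
    return sum(p[i][i] for i in range(3))

alpha = rotation_about_z(math.radians(40))
p = [[1.0, 2.0, 0.0], [0.0, 3.0, -1.0], [4.0, 0.0, 5.0]]
a = [1.0, -2.0, 0.5]

p_new = transform_rank2(alpha, p)
image_then_transform = transform_rank1(alpha, apply_mapping(p, a))
transform_then_image = apply_mapping(p_new, transform_rank1(alpha, a))
print(trace(p), trace(p_new))
```

Besides the equal traces, the two ways of computing the image in the new basis agree, which is what it means for the set of coefficients to describe one and the same mapping.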

In the example given here the quantities $p_{ij}$ were nondimensional.
Conversely, it can be verified that any set of nondimensional quantities
$p_{ij}$ that transform via the formulas (23) under the change of Cartesian
basis (19) can be interpreted as the matrix of a linear mapping
of space into itself. In Sec. 11.5 we will give examples of dimensional
quantities with a different physical meaning, but such that also transform
via the formulas (23) under a change of basis.

Let us rewrite formulas (21) and (23) and change the indices:

$$a'_{i'} = \alpha_{i'i}a_i, \qquad p'_{i'j'} = \alpha_{i'i}\alpha_{j'j}p_{ij}$$

We see that these formulas have a common structure. Generalizing,
we arrive at a law of transformation of quantities $q_{ij\ldots l}$ that depend
on m indices i, j, ..., l, each of which takes on the values 1, 2, 3
(for three-dimensional space) or 1, 2 (for a plane, or two-dimensional
space):

$$q'_{i'j'\ldots l'} = \alpha_{i'i}\alpha_{j'j}\cdots\alpha_{l'l}\,q_{ij\ldots l} \tag{24}$$

If the quantities $q_{ij\ldots l}$ assume definite values only after indication
of the Cartesian basis and if they transform via the formulas (24)
(on the right we have an m-fold sum!) under the change of basis (19),
then we say that the quantities $q_{ij\ldots l}$ constitute a tensor of rank m.

Thus,  the  set  of  projections  of  a vector  form  a tensor  of  rank  one,  the 
set  of  coefficients  of  a linear  mapping,  a tensor  of  rank  two  (in  every 
basis  this  set  of  coefficients  is  specific,  but  the  set  defines  one  and  the 
same  mapping  irrespective  of  the  choice  of  basis).  Note  that  from  the 
viewpoint  of  this  definition,  a tensor  of  rank  zero  is  to  be  regarded  as 
a quantity  that  takes  on  a numerical  value  in  the  chosen  units  of 
measurement  and  that  is  invariant  under  a change  of  basis ; in  other 
words,  it  is  a scalar. 

We can also approach the tensor concept differently. For the sake
of definiteness, consider a second-rank tensor, that is, a set of quantities
$p_{ij}$ that transform via formulas (23) under a change of basis (19).
With this set let us associate the following formal expression:

$$\mathbf{p} = \sum_{i,j=1}^{3} p_{ij}\,\mathbf{e}_i\mathbf{e}_j \tag{25}$$

Here, the tensor product $\mathbf{e}_i\mathbf{e}_j$ (not to be confused with the scalar
product $\mathbf{e}_i\cdot\mathbf{e}_j$!) does not reduce to any other simpler object like a vector or
a scalar. The factors in a tensor product cannot be interchanged:
$\mathbf{e}_1\mathbf{e}_2 \neq \mathbf{e}_2\mathbf{e}_1$, but in a tensor product of sums the brackets can be removed
by the ordinary rules, so long as we keep to the order of the factors,
for example:

$$(2\mathbf{e}_1 + \mathbf{e}_2 - \mathbf{e}_3)(\mathbf{e}_1 + 3\mathbf{e}_3) = 2\mathbf{e}_1\mathbf{e}_1 + 6\mathbf{e}_1\mathbf{e}_3 + \mathbf{e}_2\mathbf{e}_1 + 3\mathbf{e}_2\mathbf{e}_3 - \mathbf{e}_3\mathbf{e}_1 - 3\mathbf{e}_3\mathbf{e}_3$$

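In the new basis $\mathbf{e}'_i$ we have to write

$$\mathbf{p}' = \sum_{i,j} p'_{ij}\,\mathbf{e}'_i\mathbf{e}'_j$$

instead of (25). However, taking into account the formulas (23) and
(20), we get

$$\mathbf{p}' = \sum_{i,j,k,l}\alpha_{ik}\alpha_{jl}p_{kl}\,\mathbf{e}'_i\mathbf{e}'_j = \sum_{k,l}p_{kl}\Big(\sum_i\alpha_{ik}\mathbf{e}'_i\Big)\Big(\sum_j\alpha_{jl}\mathbf{e}'_j\Big) = \sum_{k,l}p_{kl}\,\mathbf{e}_k\mathbf{e}_l = \mathbf{p}$$

[Fig. 111]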


Thus,  expression  (25),  which  is  called  a tensor,  is  invariant  under  a 
change  of  the  Cartesian  basis.  In  similar  fashion  we  introduce  a tensor 
of any rank, with the first-rank tensor $\sum_i a_i\mathbf{e}_i$ being a vector.
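The rule for removing brackets in a tensor product of sums can be sketched in a few lines. Representing the product of two vectors by the table of coefficients of $\mathbf{e}_i\mathbf{e}_j$ is an assumption made here for illustration; the function name `tensor_product` is hypothetical.

```python
def tensor_product(a, b):
    """Coefficients of e_i e_j in the product (sum a_i e_i)(sum b_j e_j)."""
    return [[ai * bj for bj in b] for ai in a]

a = [2, 1, -1]   # 2 e1 + e2 - e3
b = [1, 0, 3]    # e1 + 3 e3

ab = tensor_product(a, b)
ba = tensor_product(b, a)

# Row i, column j holds the coefficient of e_i e_j.
for row in ab:
    print(row)
```

The rows of `ab` reproduce the coefficients 2, 6, 1, 3, −1, −3 of the example in the text, and `ba` differs from `ab`, in line with the remark that the factors cannot be interchanged.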

Tensors,  tensor  algebra,  and  tensor  analysis  (which  handles  the 
rules  for  differentiating  tensors)  play  a very  important  role  in  modern 
physics.  The  basic  laws  of  the  theory  of  elasticity,  electrodynamics, 
optics  of  an  anisotropic  medium,  and  so  forth  are  all  stated  in  tensorial 
terms. We are not forced to associate the consideration of a problem
with any one artificially chosen system of coordinates that is not justified
by the essence of the investigation. It is to be observed that in
many cases one cannot limit oneself to Cartesian bases, in which case
it becomes necessary to alter the definition of a tensor so that it
remains invariant under a change to any basis (not necessarily a Cartesian
basis). We will not dwell on this altered definition here.

Albert  Einstein  applied  tensor  analysis  to  electrodynamics  and 
the  theory  of  gravitation,  thus  attracting  the  attention  of  physicists 
and  mathematicians  to  this  field.  In  fact,  the  best  brief  exposition  of 
the  theory  of  tensors  is  given  in  Einstein's  book  The  Meaning  of 
Relativity  [4]. 


Exercise 

Write the matrices of the linear mappings of a plane into itself
given in Fig. 111. Indicate into what the square is carried for
different directions of its sides.

9.6 Multidimensional vector space

In  Sec.  4.8  we  saw  that  the  concept  of  a multidimensional  space 
can  be  approached  in  two  ways:  either  by  proceeding  from  a numerical 
scheme  or  by  considering  a system  with  many  degrees  of  freedom. 
Analogous approaches are possible to the idea of a multidimensional
vector space, which is distinguished from general spaces by the possibility
of performing linear operations. Let us consider the first approach:
for the sake of definiteness we will discuss a four-dimensional space
(the general case is considered analogously).

As we have seen, a point in four-dimensional space may be any
set of four numbers $(x_1, x_2, x_3, x_4)$. The unit vector $\mathbf{e}_1$ of the first coordinate
axis (the $x_1$-axis) is to be pictured as a line segment issuing from
the coordinate origin (0, 0, 0, 0) and with terminus at the point (1, 0, 0, 0).
Incidentally, because of the possibility of translating any vector parallel
to itself, $\mathbf{e}_1$ may be regarded as a line segment with its origin at
any point $(x_1, x_2, x_3, x_4)$ and its terminus at the point $(x_1 + 1,
x_2, x_3, x_4)$. The unit vectors $\mathbf{e}_2, \mathbf{e}_3, \mathbf{e}_4$ of the other three coordinate axes
are introduced similarly. By considering a linear combination of these
four vectors,

$$\mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + x_3\mathbf{e}_3 + x_4\mathbf{e}_4 \tag{26}$$

we obtain a vector that issues from the coordinate origin and extends
to the point $(x_1, x_2, x_3, x_4)$, that is, the radius vector of that point.
The vector x may be laid off from any point other than the origin of
coordinates. In any case, the coefficients $x_1, x_2, x_3, x_4$ in formula (26)
are the projections of x on the coordinate axes.

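The operations involving four-dimensional vectors are performed
by the same formal rules that govern two-dimensional and three-dimensional
vectors. The simplest thing is to consider vectors
resolved along the coordinate axes, that is, as represented in the form
(26). If besides the vector (26) we have a vector
$\mathbf{y} = y_1\mathbf{e}_1 + y_2\mathbf{e}_2 + y_3\mathbf{e}_3 + y_4\mathbf{e}_4$, then

$$\mathbf{x} + \mathbf{y} = (x_1 + y_1)\mathbf{e}_1 + (x_2 + y_2)\mathbf{e}_2 + (x_3 + y_3)\mathbf{e}_3 + (x_4 + y_4)\mathbf{e}_4,$$

$$\lambda\mathbf{x} = \lambda x_1\mathbf{e}_1 + \lambda x_2\mathbf{e}_2 + \lambda x_3\mathbf{e}_3 + \lambda x_4\mathbf{e}_4 \quad (\lambda \text{ a scalar}),$$

$$\mathbf{x}\cdot\mathbf{y} = x_1y_1 + x_2y_2 + x_3y_3 + x_4y_4 \tag{27}$$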

It can be verified that all the basic properties described above for two-dimensional
and three-dimensional vectors hold true also for four-dimensional
vectors. However, the properties associated with the idea
of linear dependence (Sec. 9.1) require special attention. The reason
is that four-dimensional space has planes of two kinds: two-dimensional
and three-dimensional planes. The two-dimensional plane results if
we take two linearly independent (i.e. nonparallel) vectors and then
lay off all possible linear combinations of them from some fixed point,
say the coordinate origin. If this is done with three linearly independent
vectors, then we get a three-dimensional plane (also spoken of as
a "three-dimensional hyperplane") in four-dimensional space. Now if
we take four linearly independent vectors (that is, such as are not parallel
to a single three-dimensional plane), then their linear combinations
will fill all the space, which means that any fifth vector can be resolved
in terms of these four. It follows that in four-dimensional space any
four linearly independent vectors can be taken as a basis (cf. Sec. 9.1).
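Whether four given vectors are linearly independent can be tested numerically. The sketch below uses the determinant of their projections, a standard criterion that the text itself does not state, so it should be read as an assumption; the function names and the sample vectors are illustrative.

```python
from fractions import Fraction

def determinant(rows):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                      # a row swap changes the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n):
                m[r][k] -= factor * m[col][k]
    return det

def is_basis(vectors):
    # Assumed criterion: n vectors in n-dimensional space are linearly
    # independent exactly when the determinant of their projections is nonzero.
    return determinant(vectors) != 0

independent = [(1, 1, 1, -1), (1, 1, -1, 1), (0, 0, 1, 1), (1, -1, 0, 0)]
dependent = [(1, 1, 1, -1), (1, 1, -1, 1), (0, 0, 1, 1), (2, 2, 0, 0)]

print(is_basis(independent), is_basis(dependent))
```

In the second set the last vector is the sum of the first two, so the four vectors are parallel to a single three-dimensional plane and cannot serve as a basis.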

We now consider an alternative approach to the concept of a multidimensional
vector space. We proceed from some collection R of
arbitrary entities, upon which we can perform linear operations without
going outside R (linear operations, it will be recalled, involve
adding and multiplying by a scalar). Besides this it is also required
that the very same properties indicated in Sec. 9.1 for linear operations
on vectors in a plane and in space hold true. Then R is called a linear
(or vector) space, and the component entities are termed (generalized)
vectors.

The basic numerical characteristic of a specified vector space is
the maximum possible number of linearly independent vectors, which
is also called the dimensionality of the space. For example, if R is
four-dimensional, this means that in it we can indicate four linearly
independent vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3, \mathbf{a}_4$ such that any other vector x in R
can be represented as

$$\mathbf{x} = \lambda_1\mathbf{a}_1 + \lambda_2\mathbf{a}_2 + \lambda_3\mathbf{a}_3 + \lambda_4\mathbf{a}_4 \tag{28}$$

We can thus take these vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3, \mathbf{a}_4$ in R for a basis. Since
the coefficients of the expansion in (28) can assume all possible values
for distinct vectors x, it follows that in choosing x in R there are four
degrees of freedom, which means the space R is four-dimensional also
in the sense of Sec. 4.8.

It  may  happen  that  in  a linear  space  it  is  possible  to  indicate  an 
arbitrarily  large  number  of  linearly  independent  vectors.  Such  a space 
is  said  to  be  infinite-dimensional.  An  example  of  an  infinite-dimensional 
space  will  be  examined  in  Sec.  14.7. 

If in a finite-dimensional linear space we introduce a scalar product
that satisfies the properties described in Sec. 9.2 for an ordinary
scalar product, then this space is said to be Euclidean. In Euclidean
space it is natural to introduce the following notions: the modulus of
a vector (via the formula $|\mathbf{x}| = \sqrt{\mathbf{x}\cdot\mathbf{x}}$), the unit vector, and the
orthogonality of vectors (if $\mathbf{x}\cdot\mathbf{y} = 0$). It is customary, in Euclidean
space, to choose an orthogonal basis and not just any basis. Thus,
for example, for the four-dimensional case we have a collection of four
mutually orthogonal vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3, \mathbf{a}_4$. In this case, the coefficients
of the expansion (28) of any vector x are readily found in the following
manner: form the scalar product of both sides of (28) by the vectors
$\mathbf{a}_j$ to get

$$\mathbf{x}\cdot\mathbf{a}_j = \lambda_j(\mathbf{a}_j\cdot\mathbf{a}_j)$$

The remaining terms on the right side drop out because of the condition
of orthogonality. And so

$$\lambda_j = \frac{\mathbf{x}\cdot\mathbf{a}_j}{\mathbf{a}_j\cdot\mathbf{a}_j} \qquad (j = 1, 2, 3, 4) \tag{29}$$
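Formula (29) translates directly into a short computation. In the sketch below the orthogonal basis is taken from Exercise 2 of this section together with its answer, while the vector `x` and the variable names are assumptions chosen for illustration.

```python
from fractions import Fraction

def dot(x, y):
    # Scalar product by formula (27).
    return sum(p * q for p, q in zip(x, y))

# Four mutually orthogonal vectors (from Exercise 2 and its answer).
basis = [(1, 1, 1, -1), (1, 1, -1, 1), (0, 0, 1, 1), (1, -1, 0, 0)]
x = (3, -1, 4, 2)   # sample vector, assumed for illustration

# Formula (29): each coefficient needs only one scalar product with a_j.
coeffs = [Fraction(dot(x, a), dot(a, a)) for a in basis]

# Expansion (28) rebuilt from the coefficients.
rebuilt = [sum(lam * a[i] for lam, a in zip(coeffs, basis)) for i in range(4)]
print(coeffs, rebuilt)
```

No system of equations has to be solved here; this is the practical advantage of an orthogonal basis over an arbitrary one.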

The existence of a scalar product makes possible rotations in
linear space, which then acquires the property of isotropy (the properties
of the space are the same in all directions). The vector space
thus becomes more valuable. Recall the characteristic, mentioned in
Sec. 9.5, of the state of a gas at a point given by the quantities $\theta$,
$p$, $\rho$. Increments of these quantities can be subjected to linear operations,
which means triples of such increments form a three-dimensional
linear space. But this space is devoid of a scalar product and of rotations,
which means it is not Euclidean and, hence, deficient in a sense.*

There  are  also  what  are  known  as  pseudo-Euclidean  (or  quasi- 
Euclidean)  spaces  in  which  the  square  of  the  modulus  of  the  vector 
(the  scalar  product  of  the  vector  into  itself)  can  be  positive,  negative, 
or zero. Such is (by virtue of the theory of relativity) the four-dimensional
space x, y, z, t of Cartesian coordinates and time. Rotations are
possible  in  pseudo-Euclidean  space,  although  not  all  directions  in  it 
are  of  equal  status. 

Up  to  now  we  assumed  scalars  to  be  arbitrary  real  numbers,  in 
which  case  the  linear  space  is  called  a real  linear  space . But  we  can 
also  consider  complex  linear  spaces  in  which  the  scalars  are  arbitrary 
complex  numbers.  Then  a new  factor  appears  in  defining  a Euclidean 
space:  it  is  required  that 

(y,  x)  = (x,  y )*  (30) 

That  is,  when  the  factors  in  a scalar  product  are  interchanged,  the 
product  is  replaced  by  the  conjugate  complex  number.  However,  if 
in  (30)  we  set  y = x,  then  we  see  that  the  scalar  square  of  any  vector 
is  a real  number  (and,  as  may  be  verified,  it  is  even  positive,  so  that 
the  modulus  of  the  vector  may  be  found  from  the  same  formula  as 
before,  and  this  modulus  turns  out  to  be  positive).  Besides,  when  taking 
the  scalar  factor  outside  the  sign  of  the  scalar  product,  we  make  use 
of  the  following  formulas: 

$$(\lambda\mathbf{x}, \mathbf{y}) = \lambda(\mathbf{x}, \mathbf{y}), \qquad (\mathbf{x}, \lambda\mathbf{y}) = \lambda^*(\mathbf{x}, \mathbf{y})$$

The simplest instance of a four-dimensional complex Euclidean space
is the space considered at the beginning of this section, but in this
instance the numbers $x_1, x_2, x_3, x_4$ must be complex numbers, and
instead of formula (27) we have to use the formula

$$\mathbf{x}\cdot\mathbf{y} = x_1y_1^* + x_2y_2^* + x_3y_3^* + x_4y_4^*$$
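The properties of the complex scalar product, including (30), can be checked on sample numbers. The vectors `x`, `y` and the scalar `lam` below are assumptions chosen so that all arithmetic is exact.

```python
def scalar_product(x, y):
    """Complex scalar product: the second factor is conjugated."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

x = [1 + 2j, 0j, 3 - 1j, 2j]
y = [2 - 1j, 1 + 1j, -1j, 4 + 0j]
lam = 2 - 3j

xy = scalar_product(x, y)
yx = scalar_product(y, x)
xx = scalar_product(x, x)

# Interchanging the factors gives the conjugate number, as in (30).
print(xy, yx)
# The scalar square is real and positive.
print(xx)
# A scalar comes out of the first factor as is, out of the second conjugated.
print(scalar_product([lam * c for c in x], y), lam * xy)
print(scalar_product(x, [lam * c for c in y]), lam.conjugate() * xy)
```

Without the conjugation the scalar square of a complex vector could be zero or even non-real, so the modulus could not be defined by the same formula as before.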


* Do not confuse this "non-Euclidicity" with the so-called non-Euclidean
geometry, which has nothing to do with the theory of linear spaces.

Exercises

1. Give the formula for the length of the vector (26).

2. Verify the fact that the radius vectors of the points (1, 1, 1, −1),
(1, 1, −1, 1) and (0, 0, 1, 1) are mutually perpendicular. Construct
the fourth (nonzero) vector perpendicular to these three.

ANSWERS  AND  SOLUTIONS 


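Sec. 9.1

1. $\dfrac{\mathbf{a} + \mathbf{b}}{2}$.

2. $\dfrac{\mathbf{a} + \lambda\mathbf{b}}{1 + \lambda}$.

3. $\overrightarrow{AB} = -3\mathbf{i} + \mathbf{k}$.

Sec. 9.2

1. [garbled in the source: "£3 / 2"]

2. From the figure it is evident that $F_x = 1$, $F_y = 3$, $G_x = 3$, $G_y = 1$.
For this reason $\mathbf{F}\cdot\mathbf{G} = 6$.

3. $\cos\theta = -1/2$, $\theta = 120°$.

4. $\overrightarrow{AB} = 2\mathbf{i} + \mathbf{j}$, $\overrightarrow{AC} = \mathbf{i} - 2\mathbf{j}$ and so $\overrightarrow{AB}\cdot\overrightarrow{AC} = 2 - 2 = 0$.

5. $\overrightarrow{AC} = \mathbf{F} + \mathbf{G}$, $\overrightarrow{DB} = \mathbf{F} - \mathbf{G}$, whence

$$AC^2 + DB^2 = (\mathbf{F} + \mathbf{G})\cdot(\mathbf{F} + \mathbf{G}) + (\mathbf{F} - \mathbf{G})\cdot(\mathbf{F} - \mathbf{G}) = 2\mathbf{F}\cdot\mathbf{F} + 2\mathbf{G}\cdot\mathbf{G} = AB^2 + BC^2 + CD^2 + DA^2$$

6. Locate the axes as in Fig. 112. Then $\overrightarrow{OA} = a\mathbf{i}$, $\overrightarrow{OB} = a\mathbf{j}$, $\overrightarrow{OC} = a\mathbf{k}$.
The lengths of all diagonals are the same; for instance
$\overrightarrow{OD} = a\mathbf{i} + a\mathbf{j} + a\mathbf{k}$, $OD = \sqrt{a^2 + a^2 + a^2} = \sqrt{3}\,a = 1.73a$. The angles
between diagonals are also the same; for example, the angle between
$\overrightarrow{OD}$ and $\overrightarrow{CE} = a\mathbf{i} + a\mathbf{j} - a\mathbf{k}$ is computed from the formula

$$\cos\theta = \frac{\overrightarrow{OD}\cdot\overrightarrow{CE}}{OD\cdot CE} = \frac{a^2}{\sqrt{3}\,a\cdot\sqrt{3}\,a} = \frac{1}{3}$$

that is, $\theta = 70°32'$.

And the projections of the sides on the diagonals are the same;
for example, the projection of side OA on the diagonal OD is

$$\frac{\overrightarrow{OA}\cdot\overrightarrow{OD}}{OD} = \frac{a^2}{\sqrt{3}\,a} = \frac{a}{\sqrt{3}} = 0.58a.$$

7. Label the points given in the statement of the problem as in
Fig. 113. It is clear that O has the coordinates (0, 0, 0)
and A $(a, 0, 0)$. Let B have the coordinates $(x_B, y_B, 0)$ and
C $(x_C, y_C, z_C)$. Then $OB = a$, $\overrightarrow{OB}\cdot\overrightarrow{OA} = a\cdot a\cos 60° = \dfrac{a^2}{2}$, that is,
$x_B^2 + y_B^2 = a^2$, $x_B a = \dfrac{a^2}{2}$, whence $x_B = \dfrac{a}{2}$, $y_B = \dfrac{a\sqrt{3}}{2}$.
Furthermore, $OC = a$, $\overrightarrow{OC}\cdot\overrightarrow{OA} = \dfrac{a^2}{2}$, $\overrightarrow{OC}\cdot\overrightarrow{OB} = \dfrac{a^2}{2}$, that is,
$x_C^2 + y_C^2 + z_C^2 = a^2$, $x_C a = \dfrac{a^2}{2}$, $x_C\dfrac{a}{2} + y_C\dfrac{a\sqrt{3}}{2} = \dfrac{a^2}{2}$,
whence $x_C = \dfrac{a}{2}$, $y_C = \dfrac{a}{2\sqrt{3}}$, $z_C = a\sqrt{\dfrac{2}{3}}$.
Since point D, by symmetry, is located exactly under C, we denote
its coordinates $\left(\dfrac{a}{2}, \dfrac{a}{2\sqrt{3}}, z_D\right)$. Equating the lengths
of the vectors $\overrightarrow{DC}$ and $\overrightarrow{DO}$, we get

$$\left(a\sqrt{\frac{2}{3}} - z_D\right)^2 = \left(\frac{a}{2}\right)^2 + \left(\frac{a}{2\sqrt{3}}\right)^2 + z_D^2,$$

whence $z_D = \dfrac{a}{2\sqrt{6}}$. Finally, since $\overrightarrow{DC} \parallel \mathbf{k}$ and
$\overrightarrow{DO} = -\dfrac{a}{2}\mathbf{i} - \dfrac{a}{2\sqrt{3}}\mathbf{j} - \dfrac{a}{2\sqrt{6}}\mathbf{k}$, the desired angle may be found
from the formula

$$\cos\alpha = \frac{\mathbf{k}\cdot\overrightarrow{DO}}{|\mathbf{k}|\cdot DO} = \frac{-\dfrac{a}{2\sqrt{6}}}{\sqrt{\dfrac{a^2}{4} + \dfrac{a^2}{12} + \dfrac{a^2}{24}}} = -\frac{1}{3}$$

whence $\alpha = 109°28'$.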


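Sec. 9.3

The radius vector of the point of the helical curve is
$\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k} = R\cos\omega t\,\mathbf{i} + R\sin\omega t\,\mathbf{j} + vt\,\mathbf{k}$. From this we have

$$\frac{d\mathbf{r}}{dt} = -R\omega\sin\omega t\,\mathbf{i} + R\omega\cos\omega t\,\mathbf{j} + v\,\mathbf{k} \tag{31}$$

Since this vector is directed along the tangent line, it remains to
find its angle $\theta$ with the vector $\mathbf{k}$:

$$\cos\theta = \frac{\dfrac{d\mathbf{r}}{dt}\cdot\mathbf{k}}{\left|\dfrac{d\mathbf{r}}{dt}\right|\,|\mathbf{k}|} = \frac{v}{\sqrt{R^2\omega^2 + v^2}}$$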


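Sec. 9.4

By virtue of (31), $ds = |d\mathbf{r}| = |-R\omega\sin\omega t\,\mathbf{i} + R\omega\cos\omega t\,\mathbf{j} + v\,\mathbf{k}|\,dt = \sqrt{R^2\omega^2 + v^2}\,dt$. And so

$$\boldsymbol{\tau} = \frac{d\mathbf{r}}{ds} = \frac{-R\omega\sin\omega t\,\mathbf{i} + R\omega\cos\omega t\,\mathbf{j} + v\,\mathbf{k}}{\sqrt{R^2\omega^2 + v^2}}, \qquad k = \left|\frac{d\boldsymbol{\tau}}{ds}\right| = \left|\frac{d\boldsymbol{\tau}}{dt}\right| \Big/ \frac{ds}{dt} = \frac{R\omega^2}{R^2\omega^2 + v^2}$$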

Sec. 9.5

[The matrices for examples (a) to (e) of Fig. 111 are garbled in the source.]

In examples (b), (c) and (e) every square goes into a square. In
examples (a) and (d) the square generally yields an oblique-angled
parallelogram. If the sides of the square are parallel to the
coordinate axes in the case (a) or have the slope [garbled in the source] in
the case (d), then we get a rectangle. If the square is rotated from
the indicated position through 45°, the result is a rhombus.

Sec. 9.6

1. $\sqrt{x_1^2 + x_2^2 + x_3^2 + x_4^2}$.

2. Perpendicularity is verified by means of the formula (27) since the
scalar product of any pair of vectors is equal to zero. If the fourth
vector has the projections $x_1, x_2, x_3, x_4$, then from the perpendicularity
condition we get $x_1 + x_2 + x_3 - x_4 = 0$, $x_1 + x_2 - x_3 + x_4 = 0$,
$x_3 + x_4 = 0$, whence it is easy to derive that $x_3 = x_4 = 0$,
$x_1 + x_2 = 0$. Thus, we can take the projections of
the fourth vector to be equal to 1, −1, 0, 0.


Chapter  10 
FIELD  THEORY 


10.1  Introduction 

We  say  that  a field  of  some  quantity  has  been 
specified  in  space  if  the  value  of  the  quantity  is 
stated  at  every  point  in  the  space  (or  in  some 
region of it). For example, when studying a gas
flow, one has to investigate the temperature field
(at every point the temperature has a definite
value), the field of densities, the field of pressures,
the field of velocities, and so forth. We have
a scalar  field  or  a vector  field  depending  on  the 
character  of  the  quantity:  for  instance,  fields  of 
temperatures,  pressures  or  densities  are  scalar 
fields,  whereas  fields  of  velocities  and  forces  are  vector  fields.  A 
field is stationary (or steady-state) if it does not vary with time at
every point of the space, or it is nonstationary (nonsteady-state) if it
does vary.

For the sake of definiteness let us take a scalar field and denote
it by u; also, let us introduce a Cartesian system of coordinates x, y, z.
Specifying these coordinates determines a point in space and thus the
corresponding value of u = u(x, y, z). (If the field is nonstationary,
then u = u(x, y, z, t), where t is the time. Here however we do not
regard time as a fourth coordinate of equal status, but rather as a sort
of accessory parameter so that subsequent constructions will refer to
any fixed instant of time.) Thus, from a purely formal point of view, a
stationary field is merely a function of three variables x, y, z. We must
bear in mind, however, that the coordinates can be introduced in space
in different ways, and this will cause the expression u(x, y, z) to change.
But at any given point M the value of u will of course be independent
of any choice of the coordinate system. For this reason, we often say
that u is a point function, u = u(M), since specifying M fully determines
the appropriate value of u, that is, the value of u at the point M.
When considering a field, a point function is primary relative to a
function of the coordinates for the reason that a field is meaningful
and can be investigated irrespective of any system of coordinates.


This  chapter  is  a direct  continuation  of  the  preceding  one  and  draws  on  the 
material of Secs. 9.1 and 9.2. Use will also be made of the concept of a multiple
integral (Sec. 4.7) and Green's function (Sec. 6.2).

If the quantity under study is specified in a plane, then the corresponding
field is said to be a plane field; such fields occur in the study
of heat processes in a plate (lamina) whose width is disregarded.

If the field of u is a spatial field, but proves to be independent of
z in the Cartesian coordinate system x, y, z, the field is said to be
plane-parallel. It is then often possible to disregard the z-coordinate
by considering the field in the xy-plane (i.e. to consider a plane field
instead of a plane-parallel one) and bearing in mind that the field in
all parallel planes has exactly the same form, that is, all quantities
describing the field are constant on all perpendiculars to these planes.

Exercise 

Let the quantity u have the expression $u = x^2 + y^2 + 2yz + z^2$
in a Cartesian system of coordinates. Prove that this quantity
forms a plane-parallel field.

Hint. Turn the system of coordinates through 45° about the x-axis.

10.2  Scalar  field  and  gradient 

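For the sake of simplicity, we will assume an xyz-coordinate system
(Cartesian) in space and will consider the stationary scalar field
u = u(x, y, z). Let there be a point M from which it is possible to issue
in all directions; Fig. 114 shows one such direction l. The derivative
of u in the direction l is the rate of change of the field in that direction
per unit length:

$$\frac{du}{dl} = \lim_{N\to M}\frac{u(N) - u(M)}{MN} = \lim_{\Delta s\to 0}\frac{\Delta u}{\Delta s}$$

Reasoning as we did in the derivation of formula (5), Ch. 4, we
get

$$\frac{du}{dl} = \frac{\partial u}{\partial x}\frac{dx}{dl} + \frac{\partial u}{\partial y}\frac{dy}{dl} + \frac{\partial u}{\partial z}\frac{dz}{dl}$$

The right-hand side is conveniently represented as a scalar product
of two vectors (see formula (5) of Ch. 9):

$$\frac{du}{dl} = \left(\frac{\partial u}{\partial x}\mathbf{i} + \frac{\partial u}{\partial y}\mathbf{j} + \frac{\partial u}{\partial z}\mathbf{k}\right)\cdot\left(\frac{dx}{dl}\mathbf{i} + \frac{dy}{dl}\mathbf{j} + \frac{dz}{dl}\mathbf{k}\right)$$

The first one is called the gradient of the field u and is denoted by

$$\operatorname{grad} u = \frac{\partial u}{\partial x}\mathbf{i} + \frac{\partial u}{\partial y}\mathbf{j} + \frac{\partial u}{\partial z}\mathbf{k} \tag{1}$$

Its physical meaning will be explained later on. The second vector

$$\frac{dx}{dl}\mathbf{i} + \frac{dy}{dl}\mathbf{j} + \frac{dz}{dl}\mathbf{k} = \frac{d(x\mathbf{i} + y\mathbf{j} + z\mathbf{k})}{dl} = \frac{d\mathbf{r}}{dl} = \boldsymbol{\tau}$$

is the unit vector of the direction l (see Sec. 9.4). Thus

$$\frac{du}{dl} = \operatorname{grad} u\cdot\boldsymbol{\tau} \tag{2}$$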

The first factor on the right, for a given field u, depends only on
the choice of the point M. The second factor depends only on the direction l.

Since the scalar product of a vector by the unit vector is merely
equal to the projection of the first vector on the second, formula (2)
can be rewritten thus:

$$\frac{du}{dl} = \operatorname{grad}_l u \tag{3}$$

(this is the notation for the projection of a gradient on the direction l).

Suppose we have a field u and a point M. We ask: Along what
direction l will the derivative du/dl be greatest? According to (3),
this question reduces to the following one: On what direction is the
projection of the vector grad u the largest? It is clear that any vector
projected on different directions yields the largest projection, equal to
its length, when projected on its own direction. Thus, the vector grad
u at the point M indicates the direction of fastest buildup in the field
u, and this fastest rate, referred to unit length, is equal to |grad u|;
the faster the field varies, the longer grad u is. Fig. 115 shows the vectors
of the temperature gradient at different points of a heat conductor
heated from within (hatched zone) and cooled from without. The
temperature gradient is directed towards the "heater".
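Formulas (2) and (3) lend themselves to a quick numerical check. The sketch below takes the field from the Exercise of Sec. 10.1 as a sample; the point M, the step `h`, the hand-computed gradient and the function names are assumptions made for illustration.

```python
import math

def u(x, y, z):
    # Sample scalar field (the one from the Exercise of Sec. 10.1).
    return x * x + y * y + 2 * y * z + z * z

def grad_u(x, y, z):
    # Gradient by formula (1), with the partial derivatives taken by hand.
    return (2 * x, 2 * y + 2 * z, 2 * y + 2 * z)

def directional_derivative(point, direction, h=1e-6):
    """Rate of change of u per unit length along `direction` (central difference)."""
    norm = math.sqrt(sum(c * c for c in direction))
    tau = [c / norm for c in direction]                 # unit vector of the direction
    forward = [p + h * t for p, t in zip(point, tau)]
    backward = [p - h * t for p, t in zip(point, tau)]
    return (u(*forward) - u(*backward)) / (2 * h)

M = (1.0, 2.0, -1.0)
g = grad_u(*M)
g_length = math.sqrt(sum(c * c for c in g))

d_x = directional_derivative(M, (1.0, 0.0, 0.0))       # along the x-axis
d_skew = directional_derivative(M, (1.0, -1.0, 0.0))   # some other direction
d_grad = directional_derivative(M, g)                  # along grad u itself

print(d_x, d_skew, d_grad, g_length)
```

Among the directions tried, the derivative along grad u is the largest and coincides (to rounding) with the length of the gradient, as stated in the text.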

The resulting physical meaning of the gradient also shows that
the gradient is invariantly related to the field at hand, that is, it
remains unchanged (invariant) under a change of the Cartesian axes
(this was not apparent from the definition (1) that was given in "noninvariant"
form, i.e. associated with a specific system of coordinates).
What is more, if the field u is given, then at every point of space it is
possible to find the direction and rate of fastest buildup in the field u;
thus we can find the vector grad u without resorting to coordinates and
to specifying u(x, y, z). To summarize, the gradient of a scalar field
forms a very definite vector field.

The  same  requirement  of  invariance  is  demanded  of  all  the  other 
basic  notions  in  field  theory.  As  we  have  already  explained  in  Sec.  9.5, 
when  the  axes  of  a Cartesian  coordinate  system  are  changed,  the  vectors 
remain  unchanged  (invariant)  but  their  projections  change.  Thus,  if 
some  concept  in  the  theory  of  a vector  field  is  formulated  with  the 
aid  of  coordinates  and  projections  of  that  field,  this  concept  must  satisfy 
the  requirement  of  invariance  relative  to  any  change  of  the  coordinates 
and  their  projections  under  a rotation  of  the  coordinate  axes.  Such 
will  be  the  case,  in  particular,  if  the  statement  is  expressed  in  terms  of 
scalars,  vectors  or  tensors  (Sec.  9.5). 

Note that the derivatives $u'_x$, $u'_y$, $u'_z$ are also directional
derivatives: for example, $u'_x$ is the derivative in the direction of the
$x$-axis.

The gradient of the field $u(x, y, z)$ is closely related to the level
surfaces of the field, that is, the surfaces on which the field has a
constant value, $u(x, y, z) = \text{constant}$. Depending on the physical
meaning of the field they are called isothermal, isobaric, etc. surfaces.
Namely, at every point $M$ the gradient of the field is normal (that is,
perpendicular to the tangent plane) to the level surface passing through $M$.
Indeed (Fig. 116), if $\Delta C$ is small, then the surfaces $u = C$ and
$u = C + \Delta C$ near $M$ may be regarded as nearly plane and

$$\frac{du}{dl} \approx \frac{\Delta u}{\Delta l} = \frac{\Delta C}{\Delta l} \tag{4}$$

where $\Delta l$ is the distance between the surfaces in the direction $l$.
But it is clear that $\Delta l$ will be smallest, and therefore the
derivative $\dfrac{du}{dl}$ largest, if $l$ is directed along the normal to
the surface. Whence follows our assertion.

All the concepts introduced for a spatial field carry over with
appropriate simplifications to flat fields (see Sec. 10.1). For instance,
the gradient of the field $u(x, y)$ computed from the formula
$\operatorname{grad} u = \dfrac{\partial u}{\partial x}\mathbf{i} + \dfrac{\partial u}{\partial y}\mathbf{j}$
is a vector in the $xy$-plane. At every point the gradient
of the field is normal to the level line of the field, i.e. the line
$u(x, y) = \text{constant}$ passing through this point (Fig. 117). Here it is
evident from formula (4) that the modulus of the gradient is inversely
proportional to the distance between the level lines. The gradient increases
where the level lines come close together (compare, for example, the gradient
of the field at points $A$ and $B$ in Fig. 117).

As an example, let us compute the gradient of the central-symmetric
field $u = f(r)$, where $r = \sqrt{x^2 + y^2 + z^2}$. Here the value
of $u$ depends solely on the distance of the point from the origin
and for this reason the level surfaces are spheres centred at the origin
of coordinates. If we take two spheres whose radii differ by $dr$, then
the values of the function $f$ on them will differ by $df$. Therefore, the
rate of change of the field across the level surfaces (that is, along the
radii) is equal to $\dfrac{df}{dr}$ and so

$$\operatorname{grad} u(r) = \frac{df}{dr}\,\frac{\mathbf{r}}{r} = \frac{df}{dr}\,\mathbf{r}^0 \tag{5}$$

Here, $\mathbf{r}$ is the radius vector drawn from the origin to any running
point $(x, y, z)$; $r = |\mathbf{r}|$; and $\mathbf{r}^0 = \dfrac{\mathbf{r}}{r}$
is a vector of unit length along $\mathbf{r}$.

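Let us obtain formula (5) in a different way, relying on the
expression of the gradient in coordinate form (1). We have

$$\frac{\partial u}{\partial x} = \frac{df}{dr}\,\frac{\partial r}{\partial x} = \frac{df}{dr}\,\frac{x}{\sqrt{x^2 + y^2 + z^2}} = \frac{df}{dr}\,\frac{x}{r}$$

Similarly, $\dfrac{\partial u}{\partial y} = \dfrac{df}{dr}\,\dfrac{y}{r}$,
$\dfrac{\partial u}{\partial z} = \dfrac{df}{dr}\,\dfrac{z}{r}$, whence

$$\operatorname{grad} u = \frac{df}{dr}\,\frac{x}{r}\,\mathbf{i} + \frac{df}{dr}\,\frac{y}{r}\,\mathbf{j} + \frac{df}{dr}\,\frac{z}{r}\,\mathbf{k} = \frac{df}{dr}\,\frac{1}{r}\,(x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) = \frac{df}{dr}\,\frac{1}{r}\,\mathbf{r}$$

Since $\dfrac{1}{r}\dfrac{df}{dr}$ is a scalar, we see that for
$\dfrac{df}{dr} > 0$ the vectors $\operatorname{grad} u$ everywhere continue
the corresponding vectors $\mathbf{r}$ (like the spines of a rolled-up
hedgehog), and if $\dfrac{df}{dr} < 0$, then the vectors
$\operatorname{grad} u$ are all directed to the origin.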
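
Formula (5) can be compared with a direct finite-difference gradient. The
sketch below assumes a hypothetical radial profile $f(r) = r^3$ and a test
point of our own choosing; the helper names are illustrative only.

```python
import math

def f(r):
    return r ** 3            # hypothetical radial profile

def df_dr(r):
    return 3.0 * r ** 2      # its derivative, written out by hand

def u(x, y, z):
    """Central-symmetric field u = f(r)."""
    return f(math.sqrt(x * x + y * y + z * z))

def grad_numeric(field, p, h=1e-5):
    """Coordinate form (1): partial derivatives by central differences."""
    g = []
    for i in range(3):
        fwd = list(p)
        bwd = list(p)
        fwd[i] += h
        bwd[i] -= h
        g.append((field(*fwd) - field(*bwd)) / (2.0 * h))
    return g

def grad_formula(p):
    """Formula (5): grad u = (df/dr) * r_vec / r."""
    r = math.sqrt(sum(c * c for c in p))
    return [df_dr(r) * c / r for c in p]

point = (1.0, 2.0, 2.0)              # here r = 3
numeric = grad_numeric(u, point)
exact = grad_formula(point)          # (9, 18, 18)
```

Since $df/dr > 0$ in this example, both vectors point along $\mathbf{r}$,
away from the origin.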

Exercises 

1. Find the derivative of the field $u = xy - z^2$ at the point
$M(2, 1, -3)$ along the direction of the vector
$\mathbf{a} = \mathbf{i} + 3\mathbf{k}$.

2. Find the gradient of the field $u = \dfrac{k_1}{r_1} + \dfrac{k_2}{r_2}$,
where $k_1$, $k_2$ = constant, and $r_1$, $r_2$ are the distances from
certain fixed points $M_1$, $M_2$.

10.3  Potential  energy  and  force 

Suppose a body (one of very small size, often called merely a
material point or particle) in arbitrary motion in space is acted upon
by a force $\mathbf{F}$ depending solely on the position of the body. In
other words, at every point $M$ of space there is defined an appropriate
force vector $\mathbf{F} = \mathbf{F}(M)$, which thus forms a vector field of
force. As the body moves, this force performs work that is used to change the
kinetic energy of the body or to overcome the resistance of certain
external forces.

It is easy to calculate the work performed by the force $\mathbf{F}$ over
a specified path traversed by the body. For an infinitesimal displacement
$d\mathbf{r}$, the force may be considered constant and therefore the
corresponding work (see Sec. 9.2) is equal to the scalar product

$$dA = \mathbf{F} \cdot d\mathbf{r}$$

Summing these elementary pieces of work, we get the total work
carried out by the force $\mathbf{F}$ when the body covers a certain path $(L)$,

$$A = \int_{(L)} \mathbf{F} \cdot d\mathbf{r} \tag{6}$$

Such an integral taken along the line $(L)$ is called a line integral.
To evaluate it, we must know not only the field $\mathbf{F}$ and the line
$(L)$, but also the direction in which this line is traversed; we say that
the line $(L)$ must be oriented. If $(L)$ is open, all we need to know is
which of the two endpoints is regarded as the initial point and which the
terminal point, i.e. we have to indicate the limits of integration.
This orientation is essential, for if we traverse the curve in the same
field in the opposite direction, then $d\mathbf{r}$ becomes $-d\mathbf{r}$
and $A$ becomes $-A$ (the work changes sign).

The expression (6) of a line integral can be written in other
useful forms. If we recall that $d\mathbf{r} = \boldsymbol{\tau}\, ds$, where
$\boldsymbol{\tau}$ is the unit vector of the tangent to $(L)$ and $ds$ is
the differential of arc length, then we get

$$A = \int_{(L)} \mathbf{F} \cdot \boldsymbol{\tau}\, ds = \int_{(L)} F_\tau\, ds$$

where $F_\tau$ denotes the projection of the vector $\mathbf{F}$ on
$\boldsymbol{\tau}$, that is, the “tangential” projection of the vector
$\mathbf{F}$. On the other hand, if we write, in Cartesian coordinates
$x$, $y$, $z$,

$$\mathbf{F} = F_x\mathbf{i} + F_y\mathbf{j} + F_z\mathbf{k}, \qquad \mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$$

whence $d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$, and
if we recall the expression (5), Ch. 9, for a scalar product in Cartesian
projections, then we have

$$A = \int_{(L)} (F_x\, dx + F_y\, dy + F_z\, dz)$$
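
The Cartesian form of the line integral lends itself to direct numerical
evaluation. The following is a minimal sketch, assuming a hypothetical field
$\mathbf{F} = (-y, x, 0)$ and a quarter of the unit circle as the path $(L)$;
the function `work` and both choices are ours, made only for illustration.

```python
import math

def work(F, r, t0, t1, n=2000):
    """Approximate A = integral of (Fx dx + Fy dy + Fz dz) along r(t).

    The curve is split into n chords; on each chord F is sampled at the
    chord midpoint and multiplied by the chord vector dr.
    """
    total = 0.0
    for i in range(n):
        pa = r(t0 + (t1 - t0) * i / n)
        pb = r(t0 + (t1 - t0) * (i + 1) / n)
        mid = [(a + b) / 2.0 for a, b in zip(pa, pb)]
        dr = [b - a for a, b in zip(pa, pb)]
        Fx, Fy, Fz = F(*mid)
        total += Fx * dr[0] + Fy * dr[1] + Fz * dr[2]
    return total

def F(x, y, z):
    return (-y, x, 0.0)      # hypothetical force field

def quarter_circle(t):
    return (math.cos(t), math.sin(t), 0.0)

# Same line, two orientations: the limits of integration are swapped.
A_forward = work(F, quarter_circle, 0.0, math.pi / 2)
A_backward = work(F, quarter_circle, math.pi / 2, 0.0)
```

For this field the exact value along the forward orientation is $\pi/2$, and
reversing the orientation changes the sign of the result.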

The work of a force can also be expressed as an integral over
time; if the motion of a body is given, that is, if we have the function
$\mathbf{r}(t)$, then the force acting on the body is also a composite
function of the time:

$$\mathbf{F} = \mathbf{F}(\mathbf{r}) = \mathbf{F}[\mathbf{r}(t)] = \mathbf{F}(t)$$

The work of a force during time $dt$ can be expressed in terms of the
velocity $\mathbf{v}$, where $\mathbf{v} = \dfrac{d\mathbf{r}}{dt}$; since
$d\mathbf{r} = \mathbf{v}\, dt$, from formula (6) we get

$$A = \int_{t_{\text{init}}}^{t_{\text{ter}}} \mathbf{F}(t) \cdot \mathbf{v}(t)\, dt$$

where $t_{\text{init}}$ and $t_{\text{ter}}$ denote the initial and terminal
instants of motion.

Thus, the quantity $\mathbf{F} \cdot \mathbf{v}$, the scalar product of the
force and the velocity, is equal to the work per unit time and, as we know,
is termed the power.

In this section we consider a force dependent solely on the position
of the body. The dependence on time of the force acting on the
body is a result of the displacement of the body. In this case we can
say that the work $A$ during displacement over a given path is actually
independent of the velocity of the body: it does not depend on the form
of the functions $\mathbf{v}(t)$ and $\mathbf{r}(t)$, but depends only on the
trajectory (path) over which the body moves, despite the fact that
$\mathbf{v}(t)$ enters explicitly as a factor in the integrand for $A$. The
point is that $\mathbf{r}(t)$ also varies when the function $\mathbf{v}(t)$
varies. Hence for a given $\mathbf{F} = \mathbf{F}(\mathbf{r}) = \mathbf{F}(x, y, z)$
the form of $\mathbf{F}(t)$ likewise changes. So also do the limits
of integration $t_{\text{init}}$ and $t_{\text{ter}}$ that correspond to the
given initial and terminal points of the path:

$$\mathbf{r}(t_{\text{init}}) = \mathbf{r}_{\text{init}}, \qquad \mathbf{r}(t_{\text{ter}}) = \mathbf{r}_{\text{ter}}$$
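
That the work does not depend on how fast the path is traversed can be
illustrated numerically. The sketch below assumes a hypothetical force
$\mathbf{F} = (xy, y, 0)$ and the straight path $x = y = s$, $0 \le s \le 1$,
covered once uniformly and once with acceleration; all names are
illustrative.

```python
def work_over_time(F, r, t_init, t_ter, n=2000, h=1e-6):
    """A = integral of F(r(t)) . v(t) dt by the midpoint rule in t.

    The velocity v(t) is estimated by a central difference of r(t).
    """
    dt = (t_ter - t_init) / n
    total = 0.0
    for i in range(n):
        t = t_init + (i + 0.5) * dt
        pos = r(t)
        vel = [(b - a) / (2.0 * h) for a, b in zip(r(t - h), r(t + h))]
        total += sum(f * v for f, v in zip(F(*pos), vel)) * dt
    return total

def F(x, y, z):
    return (x * y, y, 0.0)   # hypothetical position-dependent force

def r_uniform(t):
    # Path x = y = s with s = t, so the motion lasts from t = 0 to t = 1.
    return (t, t, 0.0)

def r_accelerating(t):
    # Same path with s = t**2 / 4, so the motion lasts from t = 0 to t = 2.
    s = t * t / 4.0
    return (s, s, 0.0)

A_uniform = work_over_time(F, r_uniform, 0.0, 1.0)
A_accelerating = work_over_time(F, r_accelerating, 0.0, 2.0)
```

Both schedules give the same work, $5/6$ for this example, although
$\mathbf{v}(t)$, $\mathbf{F}(t)$ and the terminal instant all differ.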

For a large number of physical fields of force $\mathbf{F}$ it turns out
that if the body traverses a closed contour, the overall work performed
by the force $\mathbf{F}$ is equal to zero; in other words, if the force
performs a certain amount of work on one section of the contour, then on the
remaining portion the body moves against the action of the force and returns
the accumulated energy to the field. Such is the behaviour of a
gravitational field, an electrostatic field and many other fields.
Mathematically, this means that

$$\oint_{(L)} \mathbf{F} \cdot d\mathbf{r} = 0 \tag{7}$$

for any closed curve $(L)$, where $\oint$ is to be understood as a line
integral around a closed curve. The term circulation is also used for such an
integral. Thus, we assume that the circulation of the force $\mathbf{F}$
around any closed path is equal to zero.

The assumption (7) can be stated equivalently thus: the work
(6) of a force $\mathbf{F}$ along any open curve $(L)$ depends only on the
position of the beginning and end of the curve but does not depend on the
shape of the curve. Indeed, suppose condition (7) is fulfilled, and the
curves $(L_1)$ and $(L_2)$ have a common beginning $M$ and end $N$, and
suppose the work of the force $\mathbf{F}$ along $(L_1)$ is equal to $A_1$
and along $(L_2)$ is equal to $A_2$. We form a closed contour, going from $M$
to $N$ along $(L_1)$ and then from $N$ to $M$ along $(L_2)$. Then, by the
foregoing, the work of the force $\mathbf{F}$ along such a contour is equal
to $A_1 - A_2$. On the other hand, by virtue of (7) we get $A_1 - A_2 = 0$,
whence $A_1 = A_2$. We leave it to the reader to prove in similar fashion the
converse: that from the last equation follows condition (7).
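
The argument with the two curves $(L_1)$ and $(L_2)$ can be tried out on a
concrete field. The following sketch assumes an attractive inverse-square
field with a hypothetical constant `K = 1` and two polygonal paths of our own
choosing from $M$ to $N$.

```python
import math

K = 1.0  # hypothetical constant of proportionality

def F(x, y, z):
    """Inverse-square field directed towards the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    return (-K * x / r**3, -K * y / r**3, -K * z / r**3)

def work_polyline(F, vertices, n=4000):
    """Line integral of F . dr along straight segments joining `vertices`."""
    total = 0.0
    for a, b in zip(vertices, vertices[1:]):
        step = [(q - p) / n for p, q in zip(a, b)]
        for i in range(n):
            mid = [p + (i + 0.5) * s for p, s in zip(a, step)]
            total += sum(f * s for f, s in zip(F(*mid), step))
    return total

M, N = (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)
corner = (1.0, 2.0, 0.0)

A1 = work_polyline(F, [M, N])                  # straight path (L1)
A2 = work_polyline(F, [M, corner, N])          # detour (L2)
closed = work_polyline(F, [M, N, corner, M])   # (L1), then (L2) reversed
```

Under these assumptions `A1` and `A2` coincide to within the discretization
error, and the work around the closed contour, which equals `A1 - A2`, is
close to zero.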

Under the assumption (7) we can introduce the concept of the potential
energy of a body, that is, the energy that depends on the position
of the body (this notion is discussed in detail for one-dimensional
motion in HM, Ch. 6). Namely, by definition the value of the potential
energy $U$ at any point $M$ is equal to the work performed by the
force $\mathbf{F}$ when the body is moved from $M$ to a fixed point $M_0$.
(In concrete problems, the point $M_0$ is frequently chosen at infinity, so
that one speaks of the work performed when a body is carried to
infinity; for this purpose, the work must be finite and must not
depend on the way in which the body recedes to infinity.) In other words

$$U(M) = \int_{MM_0} \mathbf{F} \cdot d\mathbf{r} \tag{8}$$

Here the choice of a particular path from $M$ to $M_0$ is inessential.

Using the potential energy, it is easy to express the work performed
by $\mathbf{F}$ when the body is moved from any point $M_1$ to any other
point $M_2$. Indeed, if in such a movement we pass through $M_0$
(this is possible since the choice of path has no effect on the work)
and if we consider the work on the sections from $M_1$ to $M_0$ and
from $M_0$ to $M_2$, we find that the work is equal to $U(M_1) - U(M_2)$.
Thus, the work performed by a field is equal to the reduction in potential
energy of the body.

The choice of the point $M_0$, i.e. the point at which the potential
energy is set equal to zero, in determining the potential energy is
rather arbitrary. Let us see what happens if $M_0$ is replaced by another
point $\tilde{M}_0$. Since the work $\tilde{U}(M)$ performed in moving from
$M$ to $\tilde{M}_0$ is equal to the work $U(M)$ performed in moving from $M$
to $M_0$ plus the work performed in moving from $M_0$ to $\tilde{M}_0$, it
follows that $\tilde{U}(M)$ differs from $U(M)$ by an additive constant.
Thus, the potential energy is determined to within an additive constant.
However, this arbitrariness does not affect the difference of the values of
potential energy at two points, that is, it disappears when we compute
the work performed in moving from one point to the other.

We have proved the existence of potential energy in a field of
force (that is, the potentiality of this field) by proceeding from
condition (7) that the work performed around a closed path is equal to
zero. It is easy to verify that, conversely, if the potential energy exists,
then in calculating the work performed by a field during a displacement
of a body from point $M$ to point $N$, only the position of the
points is essential but not the path, which plays no role at all since
this work is equal to $U(M) - U(N)$. But we have seen that this is
equivalent to the condition (7).

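Formula (8) defines the potential energy if a field of force is
given. Let us derive the inverse formula that expresses the field in
terms of the potential energy. Assume that the body performs an
infinitesimal displacement
$d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$. Then the
force $\mathbf{F} = F_x\mathbf{i} + F_y\mathbf{j} + F_z\mathbf{k}$ performs
the elementary work

$$dA = \mathbf{F} \cdot d\mathbf{r} = F_x\, dx + F_y\, dy + F_z\, dz$$

On the other hand, this work is equal to

$$dA = -dU = -\left(\frac{\partial U}{\partial x}\, dx + \frac{\partial U}{\partial y}\, dy + \frac{\partial U}{\partial z}\, dz\right)$$

Comparing both formulas, we see that

$$F_x = -\frac{\partial U}{\partial x}, \qquad F_y = -\frac{\partial U}{\partial y}, \qquad F_z = -\frac{\partial U}{\partial z}$$

whence

$$\mathbf{F} = F_x\mathbf{i} + F_y\mathbf{j} + F_z\mathbf{k} = -\frac{\partial U}{\partial x}\mathbf{i} - \frac{\partial U}{\partial y}\mathbf{j} - \frac{\partial U}{\partial z}\mathbf{k} = -\operatorname{grad} U \tag{9}$$

(see Sec. 10.2).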
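
Formula (9) can be applied numerically when only $U$ is known. The sketch
below assumes a hypothetical potential energy $U = x^2 y + z$ and estimates
$\mathbf{F} = -\operatorname{grad} U$ by central differences; it then compares
the elementary work $dA$ with $-dU$ for a small displacement. All names and
values are illustrative.

```python
def U(x, y, z):
    # Hypothetical potential energy, chosen only for illustration.
    return x * x * y + z

def force(U, p, h=1e-5):
    """F = -grad U, estimated by central differences at the point p."""
    F = []
    for i in range(3):
        fwd = list(p)
        bwd = list(p)
        fwd[i] += h
        bwd[i] -= h
        F.append(-(U(*fwd) - U(*bwd)) / (2.0 * h))
    return F

p = (1.0, 2.0, 3.0)
F = force(U, p)                               # close to (-4, -1, -1)

dr = (1e-4, -2e-4, 3e-4)                      # a small displacement
dA = sum(f * d for f, d in zip(F, dr))        # elementary work F . dr
dU = U(*(a + d for a, d in zip(p, dr))) - U(*p)
```

For a displacement this small, `dA` and `-dU` agree up to terms of second
order in `dr`.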

Let a particle of mass $m$ be in motion under the action of a given
force field $\mathbf{F} = -\operatorname{grad} U$. Then, by Sec. 9.4, the
increment in kinetic energy is

$$\frac{mv_2^2}{2} - \frac{mv_1^2}{2} = \int_{s_1}^{s_2} F_\tau\, ds = -\int_{s_1}^{s_2} \frac{dU}{ds}\, ds = U_1 - U_2$$

and so

$$\frac{mv_2^2}{2} + U_2 = \frac{mv_1^2}{2} + U_1$$

Thus, the total energy of the particle,

$$E = \frac{mv^2}{2} + U$$

remains constant throughout the motion. We see that the total energy
is the sum of the kinetic and the potential energy. (For one-dimensional
motion a similar discussion is given in HM, Sec. 6.8.)
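
The constancy of $E$ can be observed in a simple simulation. The following
is a minimal sketch, assuming a hypothetical flat potential
$U = \tfrac12(x^2 + 4y^2)$, unit mass, and the velocity Verlet scheme as the
integrator; none of these choices comes from the text.

```python
def U(x, y):
    return 0.5 * (x * x + 4.0 * y * y)   # hypothetical potential energy

def force(x, y):
    return (-x, -4.0 * y)                # F = -grad U, written out by hand

def total_energy(m, pos, vel):
    """E = m v^2 / 2 + U."""
    return 0.5 * m * (vel[0] ** 2 + vel[1] ** 2) + U(*pos)

def simulate(m=1.0, pos=(1.0, 0.0), vel=(0.0, 0.5), dt=1e-3, steps=5000):
    """Integrate m r'' = F(r) with velocity Verlet; return E at every step."""
    x, y = pos
    vx, vy = vel
    fx, fy = force(x, y)
    energies = [total_energy(m, (x, y), (vx, vy))]
    for _ in range(steps):
        vx += 0.5 * dt * fx / m          # half step in velocity
        vy += 0.5 * dt * fy / m
        x += dt * vx                     # full step in position
        y += dt * vy
        fx, fy = force(x, y)
        vx += 0.5 * dt * fx / m          # second half step in velocity
        vy += 0.5 * dt * fy / m
        energies.append(total_energy(m, (x, y), (vx, vy)))
    return energies

energies = simulate()
drift = max(energies) - min(energies)
```

The kinetic and potential parts each vary considerably along the trajectory,
while their sum stays constant up to the small error of the integrator.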

To illustrate, let us take the case of attraction via Newton's
law, that is, the case where a body is acted upon by a force that is directed
to a certain fixed point $O$ (which we take for the coordinate origin)
and is inversely proportional to the square of the distance from this
point. Since the vector $-\dfrac{\mathbf{r}}{r}$ is directed towards the
origin and has unit length, the force field is of the form

$$\mathbf{F} = -\frac{k}{r^2}\,\frac{\mathbf{r}}{r} \tag{10}$$

where $k$ is a constant of proportionality. By virtue of the central
symmetry of the force, it is easy to see* that the potential energy is also
of the form $U = f(r)$. But then, by formulas (9) and (5), we get

$$\mathbf{F} = -\operatorname{grad} U = -\frac{df}{dr}\,\frac{\mathbf{r}}{r}$$

Comparing this with (10), we get

$$\frac{df}{dr} = \frac{k}{r^2}, \quad \text{whence} \quad U = f(r) = -\frac{k}{r} \tag{11}$$

to within an arbitrary additive constant. Thus a Newtonian field of
force is a potential field, and its potential is determined from formula
(11). This formula gives the potential normalized (normalization is
defined as a choice of one of a number of equivalent entities) by
the condition of vanishing at infinity.

* Consider the displacement of a body over a sphere centred at $O$, when
$r$ = constant. The work of the force is then equal to zero.

A problem sometimes arises of constructing the vector lines of a
force field $\mathbf{F}$ or, as we say for short, the lines of force of a
field $\mathbf{F}$. These are lines which at each point are in the direction
of the field, that is to say, lines that are tangent to the field vector
$\mathbf{F}$ at every point. The problem of constructing such lines readily
reduces to the problem of integrating a system of differential equations:
first introduce the Cartesian coordinates $x$, $y$, $z$ and then write the
condition of parallelism of the vectors

$$d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$$

and

$$\mathbf{F} = F_x(x, y, z)\,\mathbf{i} + F_y(x, y, z)\,\mathbf{j} + F_z(x, y, z)\,\mathbf{k}$$

(see end of Sec. 9.1):

$$\frac{dx}{F_x(x, y, z)} = \frac{dy}{F_y(x, y, z)} = \frac{dz}{F_z(x, y, z)} \tag{12}$$

Here we use the fact that the vector $d\mathbf{r}$ lies precisely along the
tangent to the curve (see Sec. 9.4). Equations (12) form a system of
differential equations of the vector lines of the field $\mathbf{F}$ written
in “symmetric form”, in which all three coordinates are of equal status. From
this it is easy to express $\dfrac{dy}{dx}$ and $\dfrac{dz}{dx}$ and go over
to a system of two differential equations in $y(x)$ and $z(x)$ written in the
ordinary form (Sec. 8.2).

As is clear from the geometric meaning (and as has been discussed
in detail in the theory of differential equations), exactly one
line of force passes through each point. Thus the entire space (or the
portion of it in which the field of force is specified) is filled with lines
of force.

For a Newtonian field, the lines of force are rays emanating from
the centre of attraction, i.e. from point $O$. They do not intersect
anywhere except at $O$ itself, which is a singular point of the system
of differential equations (cf. end of Sec. 10.2).
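
A line of force can also be traced step by step. The sketch below assumes the
parametric form $d\mathbf{r}/ds = \mathbf{F}/|\mathbf{F}|$ of the parallelism
condition (arc length $s$ as the parameter is our choice) together with
simple Euler steps, applied to an inverse-square attracting field with a
hypothetical constant `k = 1`.

```python
import math

def F(x, y, z, k=1.0):
    """Inverse-square field directed towards the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    return (-k * x / r**3, -k * y / r**3, -k * z / r**3)

def trace_line_of_force(F, start, ds=0.01, steps=200):
    """Euler integration of dr/ds = F/|F|, so that dr is parallel to F."""
    p = list(start)
    path = [tuple(p)]
    for _ in range(steps):
        f = F(*p)
        norm = math.sqrt(sum(c * c for c in f))
        p = [pi + ds * fi / norm for pi, fi in zip(p, f)]
        path.append(tuple(p))
    return path

start = (1.0, 2.0, 2.0)                  # here r = 3
path = trace_line_of_force(F, start)
end = path[-1]
r_end = math.sqrt(sum(c * c for c in end))

# Components of the cross product end x start; zero means the same ray.
cross = (end[1] * start[2] - end[2] * start[1],
         end[2] * start[0] - end[0] * start[2],
         end[0] * start[1] - end[1] * start[0])
```

The traced points stay on the ray through the starting point and the origin.
The step count is chosen so that the line stops well short of the singular
point at the origin.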

In Sec. 10.2 we saw that at every point the vector $\operatorname{grad} U$
is normal to the surface $U$ = constant passing through that point.
It therefore follows from (9) that the lines of force at every point are
normal to the equipotential surfaces. This, incidentally, is clear from
the fact that the greatest work is performed per unit length of path
if the body is moving normally to the surfaces of equal potential,
and, on the other hand, the same occurs if the body is in motion along
the lines of force.


Exercise

Let a flat field of force have a potential $U = \alpha xy$ ($\alpha$ = constant).
Find the field itself and its lines of force.

Consider the stationary flow of a fluid. At any point $M$ the
velocity of a particle of the fluid has a definite value
$\mathbf{v} = \mathbf{v}(M)$; we have a vector velocity field.

Consider the trajectory $(L)$ of a particle (this is called a flow
line). It is known (see, for instance, Sec. 9.4) that at each point of
$(L)$ the velocity vector $\mathbf{v}$ is tangent to $(L)$. Hence, for the
velocity field the flow lines serve as vector lines (Sec. 10.3). As in (12),
the differential equations for them can readily be obtained directly if
we express $dt$ from each of the equations

$$\frac{dx}{dt} = v_x, \qquad \frac{dy}{dt} = v_y, \qquad \frac{dz}{dt} = v_z$$

and equate the results.

Let us now investigate something that will be of importance
later on: the concept of the flux of a vector through a surface. Suppose
we have chosen in space a closed or open “oriented” surface $(\sigma)$. A
surface is said to be oriented if we indicate which side is the inside and
which the outside. Such an orientation can be done in two ways
(Fig. 118).

Now let us calculate the volume of fluid carried outwards through
$(\sigma)$ in unit time. First find the volume carried during a small time
$dt$ through an element of surface $(d\sigma)$ (Fig. 119). This is the volume
of an oblique cylinder with base $d\sigma$ and generatrix $\mathbf{v}\, dt$.
The altitude of the cylinder is equal to the projection of the generatrix on
the perpendicular to the base, i.e. it is equal to $v_n\, dt$, where $v_n$ is
the projection of the velocity vector $\mathbf{v}$ on the unit vector
$\mathbf{n}$ of the outer (directed outwards) normal to the surface. Thus,
the volume of the cylinder is equal to $v_n\, dt\, d\sigma$.

If in time $dt$ there passes through the surface element $(d\sigma)$ a
volume $v_n\, dt\, d\sigma$, then in unit time a volume $v_n\, d\sigma$ will
pass through the same element. Summing these volumes for all elements, we
find that in unit time the following volume will pass outward through the
entire surface $(\sigma)$:

$$Q = \int_{(\sigma)} v_n\, d\sigma \tag{13}$$

Such an integral of the normal projection is called the flux of the
vector $\mathbf{v}$ through the surface $(\sigma)$.*

* The integral (13) is a special case of the surface integral discussed in
Sec. 4.7.

In hydrodynamics, besides the velocity field $\mathbf{v}$ one frequently
makes use of the field of “mass velocity” $\rho\mathbf{v}$, where $\rho$ is
the density of the gas at a given point. Reasoning exactly as before, we can
easily prove that the flux

$$\int_{(\sigma)} \rho v_n\, d\sigma$$

of this vector through any oriented surface $(\sigma)$ is equal to the mass
(and not the volume, as earlier!) of the gas carried outwards in unit
time through $(\sigma)$.

Clearly, the flux is a scalar quantity. It depends essentially
on the orientation of the surface $(\sigma)$: if the orientation is reversed,
this reverses the sign of $v_n$ and also of $Q$ in formula (13).
(Incidentally, this is also clear from the meaning of the integral (13) that
we gave.) If the surface $(\sigma)$ is located so that the outgoing flow
lines intersect it everywhere, then $Q > 0$, and if the flow lines are
inward, then $Q < 0$. Now if the flow lines partially intersect $(\sigma)$
outwardly and partially inwardly, then the flux is equal to the sum of the
positive and negative quantities (which ones?) and may be positive, negative,
or equal to zero. The flux of the velocity vector through a surface
completely filled with flow lines (that is, through every point of which
a flow line passes that lies entirely on this surface) is always zero.*

* More pictorially, this is a surface along which a gas slides.

The flux (13) is sometimes written in a different form. If an oriented
plane area $(\sigma)$ is given, then it is customary to associate with it
a vector $\boldsymbol{\sigma}$ directed perpendicularly to $(\sigma)$ from
the inside to the outside (Fig. 120), the modulus of the vector being taken
equal to the area $\sigma$. This is done when the only essential things are
the area of $(\sigma)$ and its direction in space while the specific form of
$(\sigma)$ (i.e. whether it is a circle or a rectangle, etc.) is immaterial.
In this notation, the vector $d\boldsymbol{\sigma}$ will lie along
$\mathbf{n}$ (see Fig. 119) and we can write
$v_n\, d\sigma = \mathbf{v} \cdot d\boldsymbol{\sigma}$ (see formula (3) of
Ch. 9). Thus

$$Q = \int_{(\sigma)} v_n\, d\sigma = \int_{(\sigma)} \mathbf{v} \cdot d\boldsymbol{\sigma} \tag{14}$$

The flux of the vector $\mathbf{v}$ through the surface $(\sigma)$ is also
called the number of outward vector lines (flow lines) cutting $(\sigma)$.
This is a mere convention because the indicated number has dimensions and
as a rule is fractional, but it is widely used because of its pictorial
nature. Bear in mind that this “number” is to be understood algebraically,
so that if a portion of $(\sigma)$ is cut by outward lines and another
portion is cut by inward lines, the result can have any sign (and
even be zero) depending on which portion is cut by the larger
number of lines.

Here are some simple examples of computing flux. First let the
field $\mathbf{v}$ be constant (homogeneous, i.e. the same at all points) and
let the surface $(\sigma)$ be flat. Then from (14) it follows that

$$Q = v_n \sigma = \mathbf{v} \cdot \boldsymbol{\sigma} \tag{15}$$

Let the field $\mathbf{v}$ be proportional to the radius vector $\mathbf{r}$,
that is, $\mathbf{v} = k\mathbf{r}$ where $k$ is a constant of
proportionality, and let $(\sigma)$ be a sphere with centre at the origin and
oriented in a natural fashion (the inside facing the origin). Then
$v_n = v_r = kr$, and since $r$ = constant on $(\sigma)$, it follows that

$$Q = \int_{(\sigma)} kr\, d\sigma = kr \int_{(\sigma)} d\sigma = kr\sigma = 4\pi k r^3$$
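
The value $4\pi k r^3$ can be reproduced by summing $v_n\, d\sigma$ over small
patches of the sphere. The sketch below assumes a midpoint rule in spherical
angles and hypothetical values `k = 2`, `R = 1.5`; a homogeneous field is
included as a second test case.

```python
import math

def flux_through_sphere(v, R, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of Q = integral of v_n d(sigma) over the
    sphere of radius R centred at the origin, oriented outwards."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    Q = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # On this sphere the outward unit normal n equals r / R.
            n = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            point = tuple(R * c for c in n)
            v_n = sum(a * b for a, b in zip(v(*point), n))
            # Area of the patch: R^2 sin(theta) d(theta) d(phi).
            Q += v_n * R * R * math.sin(theta) * d_theta * d_phi
    return Q

k, R = 2.0, 1.5
Q_radial = flux_through_sphere(lambda x, y, z: (k * x, k * y, k * z), R)
Q_uniform = flux_through_sphere(lambda x, y, z: (0.0, 0.0, 1.0), R)
```

With these values `Q_radial` is close to $4\pi k R^3$, while for the
homogeneous field the inward and outward contributions cancel and
`Q_uniform` is close to zero.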

Finally, consider a general central-symmetric field in space
defined by

$$\mathbf{v} = f(r)\,\mathbf{r}^0 \qquad \left(\mathbf{r}^0 = \frac{\mathbf{r}}{r}\right) \tag{16}$$

where $\mathbf{r}$ is the radius vector of the running point and $r$ is its
length, so that $\mathbf{r}^0$ is a unit vector along the radius vector
indicating the direction of the field. Then the flux of the field through a
sphere of radius $r$ with centre at the origin is equal to

$$Q = Q(r) = \int_{(\sigma)} v_n\, d\sigma = \int_{(\sigma)} f(r)\, d\sigma = f(r)\, 4\pi r^2 \tag{17}$$

An interesting corollary follows from (15). Consider a closed
polyhedron (Fig. 121); orient its faces so that one of them ($(\sigma)$ in
Fig. 121) is the closing face relative to the collection of other faces. Then
it is clear that if this polyhedron is imagined to be in a homogeneous
field of velocity $\mathbf{v}$, the flux through $(\sigma)$ is equal to the
algebraic sum of the fluxes through the other faces: $Q = Q_1 + Q_2 + Q_3$,
whence

$$\mathbf{v} \cdot \boldsymbol{\sigma} = \mathbf{v} \cdot \boldsymbol{\sigma}_1 + \mathbf{v} \cdot \boldsymbol{\sigma}_2 + \mathbf{v} \cdot \boldsymbol{\sigma}_3 = \mathbf{v} \cdot (\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2 + \boldsymbol{\sigma}_3)$$

and, because of the arbitrary nature of $\mathbf{v}$,

$$\boldsymbol{\sigma} = \boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2 + \boldsymbol{\sigma}_3 \tag{18}$$

To summarise, then, the vector of the closing area is equal to the
sum of the vectors of the “component” areas, which is to say that
the areas can be added like vectors. We made essential use of the
fact that the areas are represented by vectors precisely as shown
in Fig. 120, which is a strong argument in favour of this representation.

This question can be approached from another angle. Orient
all the faces of the polyhedron in Fig. 121 in natural fashion, so that
the vectors $\boldsymbol{\sigma}_i$ and $\boldsymbol{\sigma}$ point outwards.
Then in (18) substitute $-\boldsymbol{\sigma}_i$ for $\boldsymbol{\sigma}_i$
and transpose all terms to the left-hand side to get
$\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2 + \boldsymbol{\sigma}_3 + \boldsymbol{\sigma} = 0$.
This relation is fully analogous to the familiar (from vector algebra) rule
on the sum of the vectors forming a closed polygon being equal to zero (see
Sec. 9.1). This also justifies representing the areas by vectors.
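
The vanishing sum of outward area vectors is easy to verify for a concrete
polyhedron. The following sketch assumes a tetrahedron with vertices of our
own choosing; each face vector is half the cross product of two edges,
flipped where necessary so that it points away from the centroid.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def outward_area_vector(face, centroid):
    """Area vector of a triangular face, oriented away from `centroid`."""
    a, b, c = face
    s = tuple(0.5 * x for x in cross(sub(b, a), sub(c, a)))
    # Flip the vector if it points from the face towards the interior.
    if dot(s, sub(a, centroid)) < 0:
        s = tuple(-x for x in s)
    return s

# Hypothetical tetrahedron OABC.
O, A, B, C = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)
centroid = tuple(sum(v[i] for v in (O, A, B, C)) / 4.0 for i in range(3))
faces = [(O, A, B), (O, A, C), (O, B, C), (A, B, C)]

vectors = [outward_area_vector(f, centroid) for f in faces]
total = tuple(sum(v[i] for v in vectors) for i in range(3))
```

Here the three faces lying in the coordinate planes give $(0, 0, -1)$,
$(0, -1.5, 0)$ and $(-3, 0, 0)$, and the slanted face gives $(3, 1.5, 1)$, so
that `total` is the zero vector.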

In the limit, we can obtain from a polygon any closed curve
$(L)$ and from a polyhedron, any closed surface $(\sigma)$; we thus arrive
at the equations $\oint_{(L)} d\mathbf{r} = 0$,
$\oint_{(\sigma)} d\boldsymbol{\sigma} = 0$ (the first one is obvious), where
the circle on the integral sign (the circle can be dropped) stresses
the fact that the integral is taken around a closed curve or surface.

Exercise 

Consider the general axial-symmetric field in space defined by
$\mathbf{v} = f(|\boldsymbol{\rho}|, z)\,\boldsymbol{\rho}^0$
$\left(\boldsymbol{\rho}^0 = \dfrac{\boldsymbol{\rho}}{|\boldsymbol{\rho}|}\right)$,
where $\boldsymbol{\rho}$ is a vector from the point $(0, 0, z)$ to the point
$(x, y, z)$. What is the flux of this field through the surface of a right
circular cylinder of radius $|\boldsymbol{\rho}|$ whose axis is the $z$-axis?
What is the flux in the special case of a plane-parallel field?

10.5  Electrostatic  field,  its  potential  and  flux

It is a well-known fact that a charge placed in space generates
an electric field that can be analyzed on the basis of Coulomb's law.
For the sake of simplicity, we consider space empty (a vacuum) and
first examine the field of a charge $q$ placed at the origin of coordinates
$O$. To investigate the field, we can place at an arbitrary point $M$
a unit test charge and see with what force the field acts on it. This
force $\mathbf{E}$ is termed the intensity of the electric field. Since at
different points $M$ the intensity $\mathbf{E}$ is different, we have a
vector field $\mathbf{E} = \mathbf{E}(M)$, which is actually a special case
of the force field considered in Sec. 10.3 (provided we always have in mind
the action on a unit test charge). The mathematical statement of Coulomb's
law is similar to that of Newton's law considered in Sec. 10.3. For a point
charge $q$ it is of the form

$$\mathbf{E} = \frac{kq}{r^2}\,\mathbf{r}^0 = \frac{kq}{r^3}\,\mathbf{r}$$

($\mathbf{r}^0 = \dfrac{\mathbf{r}}{r}$ is a unit vector along the radius
vector). This formula is somewhat different from (10) since the force must be
directly proportional to $q$ and, besides, for $q > 0$ a unit positive test
charge is not attracted but repelled. The coefficient $k$ in the last formula
depends on the system of units chosen, and for simplicity we set it
equal to unity.* Thus we write

$$\mathbf{E} = \frac{q}{r^2}\,\mathbf{r}^0 = \frac{q}{r^3}\,\mathbf{r} \tag{19}$$

In Sec. 10.3 we saw that such a field has a potential

$$\varphi(M) = \frac{q}{r} = \frac{q}{OM}$$

which is connected with the field $\mathbf{E}$ by the relation

$$\mathbf{E} = -\operatorname{grad} \varphi \tag{20}$$

and is normalized by the condition that it vanishes at infinity. By
virtue of the homogeneity of space, the same kind of charge $q$ placed
at the point $M_0(x_0, y_0, z_0)$ generates a potential (we write
$\varphi(\mathbf{r})$ instead of $\varphi(M)$, regarding $\mathbf{r}$ as the
radius vector of the point $M$):

$$\varphi(\mathbf{r}) = \frac{q}{|\mathbf{r} - \mathbf{r}_0|} \tag{21}$$

where $\mathbf{r}_0$ is the radius vector of the point $M_0$.**

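* This means that $\mathbf{E}$ and $q$ are expressed in the electrostatic
system of units CGSE.

** Do not confuse the radius vector $\mathbf{r}_0$ with the unit vector
$\mathbf{r}^0$.

Now if we have a system of charges in space, the electric fields
$\mathbf{E}$ associated with these charges add together. (This important
fact, which signifies the independence of the separate charges, has been
established experimentally.) But from formula (20) it is easy to verify that
the corresponding potentials are also additive. Indeed, if
$\mathbf{E} = \mathbf{E}_1 + \mathbf{E}_2$, where

$$\mathbf{E}_1 = -\operatorname{grad} \varphi_1 = -\left(\frac{\partial \varphi_1}{\partial x}\mathbf{i} + \frac{\partial \varphi_1}{\partial y}\mathbf{j} + \frac{\partial \varphi_1}{\partial z}\mathbf{k}\right), \qquad \mathbf{E}_2 = -\operatorname{grad} \varphi_2 = -\left(\frac{\partial \varphi_2}{\partial x}\mathbf{i} + \frac{\partial \varphi_2}{\partial y}\mathbf{j} + \frac{\partial \varphi_2}{\partial z}\mathbf{k}\right)$$

then, adding like projections, we get

$$\mathbf{E} = -\left[\frac{\partial(\varphi_1 + \varphi_2)}{\partial x}\mathbf{i} + \frac{\partial(\varphi_1 + \varphi_2)}{\partial y}\mathbf{j} + \frac{\partial(\varphi_1 + \varphi_2)}{\partial z}\mathbf{k}\right] = -\operatorname{grad}(\varphi_1 + \varphi_2)$$

which is to say that $\varphi = \varphi_1 + \varphi_2$ serves as the
potential of the field $\mathbf{E}$.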

We see that an electric field formed by any system of charges at
rest possesses a potential and, besides, it is found that in studying
the dependence of the potential on the field the superposition principle
applies (the law of linearity, see Sec. 6.2). This makes it possible to
construct, via a general method, the potential of an electric field
resulting from charges being distributed in space with a certain
variable density $\rho$. If the density has the aspect of a spatial delta
function (see Sec. 6.3); that is, if

$$\rho(\mathbf{r}) = \delta(\mathbf{r} - \mathbf{r}_0) \qquad (22)$$

then the corresponding charge is a unit charge concentrated at a
point with radius vector $\mathbf{r}_0$. For this reason, the potential
corresponding to the density (22) (it is the Green's function in the given
problem) is computed from formula (21) with q = 1; thus,

$$G(\mathbf{r}, \mathbf{r}_0) = \frac{1}{|\mathbf{r} - \mathbf{r}_0|}$$

Here, $\mathbf{r}_0$ determines the point of action, and $\mathbf{r}$ the point of observation.

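Knowing the Green's function, it is easy, by the general method
of Sec. 6.2, to obtain an expression for the potential in the case of
an arbitrary density $\rho(\mathbf{r})$:

$$\varphi(\mathbf{r}) = \int_{(\Omega_0)} G(\mathbf{r}, \mathbf{r}_0)\,\rho(\mathbf{r}_0)\,d\Omega_0 = \int_{(\Omega_0)} \frac{\rho(\mathbf{r}_0)}{|\mathbf{r} - \mathbf{r}_0|}\,d\Omega_0$$

where the integration is over the entire region $(\Omega_0)$ occupied by the
charge. In coordinate form this is written thus:

$$\varphi(x, y, z) = \iiint_{(\Omega_0)} \frac{\rho(x_0, y_0, z_0)}{\sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2}}\,dx_0\,dy_0\,dz_0 \qquad (23)$$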


Examples  in  computing  potential  will  be  considered  below  in 
Sec.  10.6. 
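
Formula (23) can be tried out numerically before any closed-form examples are worked. The following is only a sketch under stated assumptions, not part of the original exposition: it takes a ball of unit radius with unit density, replaces the integral by a midpoint sum over small cubes, and compares the result at an outside point with the point-charge value q/r obtained for this case in Sec. 10.6. The name `ball_potential_sum` and the grid size are illustrative choices.

```python
import math

def ball_potential_sum(point, radius=1.0, density=1.0, n=24):
    """Approximate formula (23) for a uniformly charged ball centred at the origin.

    The bounding cube is cut into n**3 small cubes of side h; every cube whose
    centre lies inside the ball is treated as a point charge density * h**3.
    The observation point is assumed to lie outside the ball, so the
    distance in the denominator never vanishes.
    """
    x, y, z = point
    h = 2.0 * radius / n
    total = 0.0
    for i in range(n):
        x0 = -radius + (i + 0.5) * h
        for j in range(n):
            y0 = -radius + (j + 0.5) * h
            for k in range(n):
                z0 = -radius + (k + 0.5) * h
                if x0 * x0 + y0 * y0 + z0 * z0 <= radius * radius:
                    dist = math.sqrt((x - x0) ** 2 + (y - y0) ** 2 + (z - z0) ** 2)
                    total += density * h ** 3 / dist  # G(r, r0) * rho(r0) * dOmega0
    return total

q = 4.0 / 3.0 * math.pi  # total charge of a unit ball with unit density
print(ball_potential_sum((3.0, 0.0, 0.0)), q / 3.0)
```

The two printed numbers should agree to within the discretization error of the staircase approximation of the ball, which shrinks as `n` grows.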

Now let us consider the flux of vector E through a closed surface
$(\sigma)$ oriented in "natural fashion": so that the inside faces a finite
region bounded by $(\sigma)$ and the outside faces infinity. In Sec. 10.4 we
said that such a flux is also called the number of vector lines (i.e., for
a field E, the "electric lines of force") cutting $(\sigma)$ in the outward
direction. Since this number is taken in the algebraic meaning, that
is, the difference between the number of emanating lines and the
number of entering lines, it can be understood as the number of lines
originating inside $(\sigma)$.

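In the simplest case, where E is generated by a point charge q
located at the coordinate origin and $(\sigma)$ is a sphere with centre at
the origin with radius R, the flux can readily be found: on $(\sigma)$

$$\mathbf{E} = \frac{q}{R^2}\,\mathbf{r}^0 = \frac{q}{R^2}\,\mathbf{n}$$

since $\mathbf{r}^0$ on a sphere coincides with the unit vector of the outer
normal $\mathbf{n}$. But then, on the sphere, $E_n = q/R^2 = \text{constant}$ and, by
definition (cf. formula (13)), the flux is

$$\oint_{(\sigma)} E_n\,d\sigma = \oint_{(\sigma)} \frac{q}{R^2}\,d\sigma = \frac{q}{R^2}\oint_{(\sigma)} d\sigma = \frac{q}{R^2}\,4\pi R^2 = 4\pi q \qquad (24)$$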

Thus,  the  flux  at  hand  does  not  depend  on  the  radius  R and  is  directly 
proportional  to  the  charge. 

As in the case of (24), proof can be given that the flux of the
vector E through a portion of a sphere with centre in a point charge
(Fig. 122) is equal to $\omega q$, where $\omega$ is the corresponding solid angle
(the corresponding area on a sphere of unit radius). But from this
it follows immediately that for a field generated by a point charge
the electric lines of force cannot begin or terminate in any volume
that does not contain the charge (i.e. they begin or terminate only on
that charge). Indeed, choosing a small volume $(d\Omega)$ as shown in
Fig. 122, it is easy to see that just as many lines emerge through its
spherical walls as enter, but through the conical wall the lines do
not pass at all.



Thus, for any surface $(\sigma)$ containing a point charge q within it,
the number of lines of force originating inside $(\sigma)$ (i.e., the flux
of the vector E through $(\sigma)$) is equal to $4\pi q$. In other words, for the
field at hand, the lines of force, for q > 0, originate on the charge
(their quantity is proportional to the charge) and go off to infinity.
If the charge is negative, the lines of force originate at infinity and
terminate on the charge.

Now suppose that there are several charges, say $q_1$, $q_2$ and $q_3$,
inside the closed surface $(\sigma)$. Then the associated fields and, for this
reason, the fluxes are additive, which means the overall flux through
$(\sigma)$ is equal to $4\pi q_1 + 4\pi q_2 + 4\pi q_3 = 4\pi(q_1 + q_2 + q_3) = 4\pi q$,
where $q = q_1 + q_2 + q_3$ is the total charge located inside $(\sigma)$.

To summarize: the number of electric lines of force originating
in any volume of space is proportional (with proportionality constant
$4\pi$) to the total charge contained in the volume. This assertion is
known as the Gauss theorem.
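
The Gauss theorem can be checked numerically for charges that do not sit at the centre of the surface. The sketch below is an illustration only: it assumes units with k = 1 as in formula (19), takes the unit sphere centred at the origin as the surface, and approximates the surface integral of $E_n$ by the midpoint rule in the spherical angles. The function name and the sample charges are hypothetical.

```python
import math

def flux_through_sphere(charges, radius=1.0, n_theta=200, n_phi=400):
    """Flux of E through the sphere |r| = radius centred at the origin.

    `charges` is a list of (q, (x0, y0, z0)). The field of each charge is
    formula (19) shifted to the charge position, and the fields are added
    (superposition). No charge may lie on the sphere itself.
    """
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # outward unit normal n, which coincides with r0 on the sphere
            nx = math.sin(theta) * math.cos(phi)
            ny = math.sin(theta) * math.sin(phi)
            nz = math.cos(theta)
            e_n = 0.0
            for q, (x0, y0, z0) in charges:
                dx, dy, dz = radius * nx - x0, radius * ny - y0, radius * nz - z0
                dist3 = (dx * dx + dy * dy + dz * dz) ** 1.5
                e_n += q * (dx * nx + dy * ny + dz * nz) / dist3
            # surface element of the sphere: R^2 sin(theta) d_theta d_phi
            total += e_n * radius ** 2 * math.sin(theta) * d_theta * d_phi
    return total

inside = [(2.0, (0.3, 0.1, -0.2)), (-0.5, (0.0, 0.4, 0.0))]
print(flux_through_sphere(inside), 4.0 * math.pi * 1.5)
```

For the two inside charges the flux should come out close to $4\pi(2 - 0.5)$, while a charge placed outside the sphere should contribute a flux close to zero.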

10.6  Examples 

The field formed by a point charge is the simplest type of electric
field. More complicated fields result from a combination of fields of
point charges located at distinct points.

In many important cases, a field can be constructed by direct
application of the Gauss theorem (see end of Sec. 10.5). Let us
consider, say, a field created by a sphere of radius $R_0$ charged with a
constant volume density $\rho$.

If the centre of the sphere is taken as the origin, then by virtue
of the spherical symmetry of the problem it is clear that the intensity
E of the field has the form

$$\mathbf{E} = F(r)\,\mathbf{r}^0 \qquad (25)$$

where F(r) is a certain scalar function. If by Gauss' theorem we
equate the vector flux, computed from formula (17), through a sphere
of radius r to the charge inside the sphere multiplied by $4\pi$, then we
get

$$F(r)\cdot 4\pi r^2 = \begin{cases} 4\pi\cdot\dfrac{4}{3}\pi r^3\rho & (r < R_0), \\[4pt] 4\pi\cdot\dfrac{4}{3}\pi R_0^3\rho = 4\pi q & (r > R_0) \end{cases}$$

where q is the total charge of the sphere. From this, finding F(r) and
substituting into (25), we get

$$\mathbf{E} = \begin{cases} \dfrac{4}{3}\pi\rho r\,\mathbf{r}^0 = \dfrac{4}{3}\pi\rho\,\mathbf{r} & \text{(inside of sphere)}, \\[4pt] \dfrac{q}{r^2}\,\mathbf{r}^0 & \text{(outside of sphere)} \end{cases}$$

Fig. 123


In particular (cf. formula (19)), outside the sphere the field
intensity is exactly as if the entire charge were concentrated at the centre
of the sphere. The graph of the modulus of the vector E as a function
of the radius is shown in Fig. 123 by the solid line.

By symmetry, the corresponding potential is of the form U = U(r).
Recalling formula (5), we find that

$$\frac{dU}{dr} = -F(r) = \begin{cases} -\dfrac{4}{3}\pi\rho r & (0 \le r \le R_0), \\[4pt] -\dfrac{q}{r^2} & (R_0 \le r < \infty) \end{cases}$$

whence

$$U = \begin{cases} -\dfrac{2}{3}\pi\rho r^2 + C_1 & (0 \le r \le R_0), \\[4pt] \dfrac{q}{r} + C_2 & (R_0 \le r < \infty) \end{cases}$$

where $C_1$, $C_2$ are constants. Normalizing the potential by the condition
$U(\infty) = 0$, we get $C_2 = 0$; and for U(r) to be continuous for $r = R_0$,
it must be true that

$$-\frac{2}{3}\pi\rho R_0^2 + C_1 = \frac{q}{R_0},$$

that is

$$C_1 = \frac{2}{3}\pi\rho R_0^2 + \frac{q}{R_0}$$

And so, finally,

$$U = \begin{cases} \dfrac{2}{3}\pi\rho\,(R_0^2 - r^2) + \dfrac{q}{R_0} & (0 \le r \le R_0), \\[4pt] \dfrac{q}{r} & (R_0 \le r < \infty) \end{cases}$$

Now let us consider the field of a charge distributed with constant
density $\rho$ between two concentric spheres of radii $R_1$ and $R_0$. By
virtue of the principle of superposition, this field may be obtained
as the difference of fields obtained in a distribution of charge of
density $\rho$ in a sphere of radius $R_0$ and in a sphere of radius $R_1$. The graphs
of both fields are shown in Fig. 123. We see that inside the cavity
(i.e., inside the sphere of radius $R_1$) the difference is zero, which means
there is no electric field; in other words, the potential is constant.
Outside the sphere of radius $R_0$ the field is of the form

$$\mathbf{E} = \frac{q_0}{r^2}\,\mathbf{r}^0 - \frac{q_1}{r^2}\,\mathbf{r}^0 = \frac{q_0 - q_1}{r^2}\,\mathbf{r}^0 = \frac{q}{r^2}\,\mathbf{r}^0$$

where $q_0$ and $q_1$ are the charges of the two solid spheres and
$q = q_0 - q_1$ is the charge of the layer. Again, the field is as if the
entire charge were concentrated in the centre. The graph of a field for
a hollow charge is shown in Fig. 124.
These same conclusions hold true for infinitely close $R_1$ and $R_0$,
which is to say for uniform distribution of charge on the sphere.
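
The piecewise formulas for the charged sphere are easy to tabulate and cross-check. The sketch below is illustrative only (the function names are hypothetical, and units with k = 1 are assumed): it encodes F(r) and U(r) as derived above and builds the field of the charged layer as a difference of two solid spheres.

```python
import math

def ball_field(r, radius, density):
    """Magnitude F(r) of E = F(r) r0 for a uniformly charged ball."""
    q = 4.0 / 3.0 * math.pi * radius ** 3 * density
    if r < radius:
        return 4.0 / 3.0 * math.pi * density * r
    return q / r ** 2

def ball_potential(r, radius, density):
    """Potential U(r) with U(infinity) = 0, continuous at r = radius."""
    q = 4.0 / 3.0 * math.pi * radius ** 3 * density
    if r < radius:
        return 2.0 / 3.0 * math.pi * density * (radius ** 2 - r ** 2) + q / radius
    return q / r

def shell_field(r, r_inner, r_outer, density):
    """Field of a charged layer between two spheres, by superposition."""
    return ball_field(r, r_outer, density) - ball_field(r, r_inner, density)

# Inside the cavity the two contributions cancel; outside, the layer
# acts like a point charge q0 - q1 placed at the centre.
print(shell_field(0.3, 0.5, 1.0, 1.0), shell_field(2.0, 0.5, 1.0, 1.0))
```

A quick consistency test is that the central difference of `ball_potential` with respect to r reproduces `-ball_field` on both sides of the surface.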

For the next example, we consider a charge distributed with
constant linear density $\mu$ along an infinite rectilinear string which will
be taken for the z-axis. This is a plane-parallel field (Sec. 10.1) and
it can be obtained by summing (integrating) the fields due to small
charges located along the string. But it is simpler to use the Gauss
theorem. Note that by virtue of the axial symmetry of the problem,
the field should have the form

$$\mathbf{E} = \varphi(|\boldsymbol{\rho}|)\,\boldsymbol{\rho}^0$$

where $\boldsymbol{\rho}$ is a vector perpendicular to the string and directed away
from it to the running point (if the string is along the z-axis,
then $\boldsymbol{\rho} = x\mathbf{i} + y\mathbf{j}$); $\boldsymbol{\rho}^0 = \boldsymbol{\rho}/|\boldsymbol{\rho}|$ is the unit vector (do not confuse
the vector $\boldsymbol{\rho}$ with the volume density of charge) and $\varphi(|\boldsymbol{\rho}|)$ is a
certain scalar function. Applying the Gauss theorem to a cylinder of
altitude h and radius $|\boldsymbol{\rho}|$ shown in Fig. 125, we get

$$\varphi(|\boldsymbol{\rho}|)\,2\pi|\boldsymbol{\rho}|\,h = 4\pi\mu h$$

(here it must be taken into account that the flux of the vector E through
the bases of the cylinder is equal to zero, since the electric lines of
force lie along these bases). From this we have $\varphi(|\boldsymbol{\rho}|) = 2\mu/|\boldsymbol{\rho}|$, or

$$\mathbf{E} = \frac{2\mu}{|\boldsymbol{\rho}|}\,\boldsymbol{\rho}^0 \qquad (26)$$

In Fig. 125 are shown the corresponding lines of force, which (in the
case $\mu > 0$) originate on the charged string and diverge to infinity.

An interesting difficulty arises in computing the potential of
the field under consideration. We might attempt to sum the potentials
of the elementary charges of the string. By (21) this would yield, at
some point M(x, y, z),

$$\varphi = \int_{-\infty}^{\infty} \frac{\mu\,d\zeta}{|x\mathbf{i} + y\mathbf{j} + z\mathbf{k} - \zeta\mathbf{k}|} = \int_{-\infty}^{\infty} \frac{\mu\,d\zeta}{\sqrt{x^2 + y^2 + (z - \zeta)^2}} \qquad (27)$$

But for large $|\zeta|$ the denominator behaves like $\sqrt{\zeta^2} = |\zeta|$ and for
this reason the integral (27) diverges (see Sec. 3.3),
which means it has an infinite value! Clearly this is no way to find
the potential.

But recall that we are always interested not in the value itself
of the potential but in the derivatives of it (for finding the field E)
or in the difference of its values for two points. Therefore the situation
is exactly as described in Sec. 3.6 for a divergent integral depending
on a parameter (here we have three parameters: x, y, z). Also note
that the string is always finite, so instead of (27) we have to consider
the integral

$$\varphi = \int_{-N}^{N} \frac{\mu\,d\zeta}{\sqrt{x^2 + y^2 + (z - \zeta)^2}}$$

for very large N. This integral can conveniently be split into two:
$\int_{-N}^{z} + \int_{z}^{N}$; then in the first substitute $\eta$ for $\zeta - z$, and in the
second integral, $\eta$ for $z - \zeta$. Then we get (verify this!)

$$\varphi = -\mu\ln(x^2 + y^2) + \mu\ln\left[N - z + \sqrt{x^2 + y^2 + (N - z)^2}\right] + \mu\ln\left[N + z + \sqrt{x^2 + y^2 + (N + z)^2}\right] = -\mu\ln(x^2 + y^2) + P_N(x, y, z) \qquad (28)$$

where $P_N$ denotes the sum of the "long logarithms". Hence, the
potential at the fixed point Q(x, y, z), which potential depends on the length
of the string, becomes infinite when the length of the string tends to
infinity. In nature we never have to do with infinitely long strings
with constant charge density on them. In computing the potential,
it is essential that when the string is of finite length, the quantity
N, which characterizes the length, enters into the answer even when
$N \to \infty$.

It  is  a remarkable  fact  that  despite  this  the  field  E near  the  string 
is  independent  of  the  length  of  the  string.* 

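The exact mode of operations is as follows: fix some large but
finite N and find the field by the general rule (20):

$$E_x = \frac{2\mu x}{x^2 + y^2} - \frac{\partial P_N}{\partial x}, \qquad \frac{\partial P_N}{\partial x} = \frac{\mu x}{\left(N - z + \sqrt{x^2 + y^2 + (N - z)^2}\right)\sqrt{x^2 + y^2 + (N - z)^2}} + \text{similar term with } N + z$$

The expression $E_y$ is obtained by substituting y for x, and

$$E_z = -\frac{\partial P_N}{\partial z} = \frac{\mu}{\sqrt{x^2 + y^2 + (N - z)^2}} - \text{similar term with } N + z$$

After differentiating, let N go to infinity; then the terms
$\dfrac{\partial P_N}{\partial x}$, $\dfrac{\partial P_N}{\partial y}$, $\dfrac{\partial P_N}{\partial z} \to 0$
and we obtain a simple answer that is not dependent on N:

$$E_x = \frac{2\mu x}{x^2 + y^2}, \qquad E_y = \frac{2\mu y}{x^2 + y^2}, \qquad E_z = 0$$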


* For this assertion to hold, the following inequalities must be fulfilled:
$\dfrac{|x|}{N} \ll 1$, $\dfrac{|y|}{N} \ll 1$, $\dfrac{|z|}{N} \ll 1$. Verify that if at least one of these
quantities satisfies the reverse inequality, that is, $\dfrac{|x|}{N} \gg 1$, or
$\dfrac{|y|}{N} \gg 1$, or $\dfrac{|z|}{N} \gg 1$, then the potential and the field are close to
the potential and the field of a finite charge equal to $2\mu N$
and located at the coordinate origin.

We can get along without the detailed procedure of obtaining $\varphi$
for a finite N, of finding E and then passing to the limit as $N \to \infty$.
To do so, we dispense with normalization of the potential $\varphi$ by the condition
$\varphi = 0$ at infinity. For example, we take for zero the potential at the
point x = 1, y = 0, z = 0 and denote by $\Phi$ the thus normalized
potential. It results from adding the constant $C = C_N$ to (28):

$$\Phi = -\mu\ln(x^2 + y^2) + P_N(x, y, z) + C$$

whence

$$C = -P_N(1, 0, 0) = -2\mu\ln\left(N + \sqrt{1 + N^2}\right)$$

It is easy to see that in the limit, as $N \to \infty$, we have

$$P_N(x, y, z) + C = P_N(x, y, z) - P_N(1, 0, 0) \to 0$$

whence

$$\Phi = -\mu\ln(x^2 + y^2)$$

Computing $\mathbf{E} = -\operatorname{grad}\Phi$, we get the preceding result, which
coincides with what was obtained via the Gauss theorem (cf. formula (26)).
Unlike the case of the potential, when computing the field via the
Gauss theorem, we can (and must!) consider at once an infinite string.
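
The behaviour of the potential as the string grows can be watched directly. The sketch below is only an illustration (the function names are hypothetical, and $\mu = 1$ is an arbitrary choice): it evaluates the closed form (28) for increasing N, and then the same potential with its zero level moved to the point (1, 0, 0).

```python
import math

def long_logs(x, y, z, n, mu=1.0):
    """P_N(x, y, z): the sum of the two 'long logarithms' in formula (28)."""
    rho2 = x * x + y * y
    upper = n - z + math.sqrt(rho2 + (n - z) ** 2)
    lower = n + z + math.sqrt(rho2 + (n + z) ** 2)
    return mu * math.log(upper) + mu * math.log(lower)

def potential_finite(x, y, z, n, mu=1.0):
    """Formula (28): potential of a string occupying -N <= zeta <= N."""
    return -mu * math.log(x * x + y * y) + long_logs(x, y, z, n, mu)

def potential_normalized(x, y, z, n, mu=1.0):
    """The same potential with the zero level taken at the point (1, 0, 0)."""
    return potential_finite(x, y, z, n, mu) - potential_finite(1.0, 0.0, 0.0, n, mu)

for n in (1e2, 1e4, 1e6):
    print(n, potential_finite(2.0, 1.0, 0.5, n), potential_normalized(2.0, 1.0, 0.5, n))
```

The first column of values keeps growing like $2\mu\ln N$, whereas the normalized values settle down to $-\mu\ln(x^2 + y^2)$, here $-\ln 5$.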

The field can also be found by integrating: by adding the
elementary fields dE of the charges $dq = \mu\,d\zeta$ located on an element of
the string. Since E is a vector, we have to add separately the
components

$$dE_x = \frac{\mu x\,d\zeta}{[x^2 + y^2 + (z - \zeta)^2]^{3/2}}, \qquad dE_y = \frac{\mu y\,d\zeta}{[x^2 + y^2 + (z - \zeta)^2]^{3/2}}, \qquad dE_z = \frac{\mu (z - \zeta)\,d\zeta}{[x^2 + y^2 + (z - \zeta)^2]^{3/2}}$$

The corresponding integrals converge, since for large $|\zeta|$ the integrand
diminishes as $|\zeta|^{-3}$ or $|\zeta|^{-2}$. For this reason we can at once take
the integrals from $\zeta = -\infty$ to $\zeta = \infty$ (see Exercise 1).
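
As a rough numerical counterpart to this integration, one can sum the elementary fields over a long but finite string. The sketch below is an assumption-laden illustration (hypothetical function name, $\mu = 1$, midpoint rule, string from $-L$ to $L$ on the z-axis); it is not a substitute for the analytic integration asked for in Exercise 1.

```python
def string_field(x, y, z, half_length, mu=1.0, steps=200000):
    """Midpoint-rule sum of the elementary fields dE of a string on the z-axis."""
    h = 2.0 * half_length / steps
    ex = ey = ez = 0.0
    for i in range(steps):
        zeta = -half_length + (i + 0.5) * h
        d3 = (x * x + y * y + (z - zeta) ** 2) ** 1.5
        ex += mu * x * h / d3           # dE_x
        ey += mu * y * h / d3           # dE_y
        ez += mu * (z - zeta) * h / d3  # dE_z
    return ex, ey, ez
```

For example, `string_field(2.0, 1.0, 0.5, half_length=1000.0)` should come out close to $(2\mu x/(x^2+y^2),\ 2\mu y/(x^2+y^2),\ 0) = (0.8, 0.4, 0)$, in agreement with formula (26).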

Finally, let us consider a field generated by a charge that is
uniformly distributed over, say, the yz-plane with constant surface
density $\nu$. Here again it is convenient to use the Gauss theorem. Note
that because of symmetry the lines of force must be parallel to the
x-axis. Let us take a thin cylinder (Fig. 126) located symmetrically
about the yz-plane. The flux through its lateral surface is equal to
zero, whereas through the bases it is equal to $2E\,ds$, again because
of symmetry. On the other hand, by Gauss' theorem this flux is
equal to the product of $4\pi$ by the charge enclosed in the cylinder,
i.e. by $\nu\,ds$. Equating the results, we get $2E\,ds = 4\pi\nu\,ds$, whence
$E = 2\pi\nu$. Thus, in the given example,

$$\mathbf{E} = 2\pi\nu\,\mathbf{i} \quad (x > 0), \qquad \mathbf{E} = -2\pi\nu\,\mathbf{i} \quad (x < 0) \qquad (29)$$

Fig. 126


We see that the intensity on both sides of the charged plane is
constant and experiences a jump of $4\pi\nu\,\mathbf{i}$ in a passage through this plane.

What will happen if, as is usual in practical situations, not the
infinite plane but a finite portion of the plane is charged? Then near
this portion (not near the boundary but inside) the finiteness of the
piece has a weak effect and we have nearly a homogeneous field that
is computed from formulas (29). Near the boundary of the piece and
also at some distance from it, the homogeneity of the field is
substantially disrupted. At a sufficient distance from the piece (as in the
case of any finite system of charges) it is possible to replace its field
with that of a point charge equal to the total charge on the indicated piece,
which means we can take advantage of formula (19).

An important case is that shown schematically in Fig. 127. Here
we have a plane capacitor: two parallel plates with equal and opposite
charges on them ($\nu$ units per cm² on one plate and $-\nu$ units per cm²
on the other plate). Check to see that to the left of A and to the right
of B the fields of the two planes are in opposite directions and together
equal zero. Inside the capacitor, between A and B, the field is equal
to $E = 4\pi\nu$ and directed from the positive plate to the negative plate.
Bear in mind, however, that this field is half due to the charges of
plate A and half due to the charges of plate B. In particular, the force
acting on plate A is equal to $2\pi S\nu^2 = \frac{1}{2}qE$ (S is the area of the
plate and $q = S\nu$ its charge), since A is acted upon
solely by the field of charges B (cf. HM, Sec. 8.4).

Fig. 127

The coefficient $\frac{1}{2}$ can also be obtained in this way: the charges
on A are located on the boundary between the region where E = 0
(to the left of A) and where $E = 4\pi\nu$ (to the right of A). The mean
field is therefore equal to $\frac{1}{2}(4\pi\nu + 0) = 2\pi\nu$. But this averaging is
precisely the method for considering only that portion of the field
that is created by B charges and is the same to the left and to the
right of A. The field of charges A themselves is of opposite sign to
the left and right of A and therefore vanishes in an averaging.

Now  let  us  take  up  the  case  where  the  field  is  obtained  by  a 
direct  superposition  of  the  fields  of  the  point  charges. 

We consider a combination of two charges, +q and -q, at
the points $\left(\frac{h}{2}, 0, 0\right)$ and $\left(-\frac{h}{2}, 0, 0\right)$. Denoting by $\mathbf{r}_+$ and $\mathbf{r}_-$ the
radius vectors of these points, we get, by virtue of (21), the total
potential:

$$\varphi(\mathbf{r}) = \frac{q}{|\mathbf{r} - \mathbf{r}_+|} - \frac{q}{|\mathbf{r} - \mathbf{r}_-|} \qquad (30)$$

Since the field is symmetric about the x-axis, it suffices to imagine
it in a plane passing through this axis, say in the xz-plane. The solid
lines in Fig. 128 are the traces of the intersection by the xz-plane of the
equipotential surfaces that result if we equate the right side of (30)
to a constant (these are closed oval surfaces of the eighth order).
The dashed lines are lines of force. They originate on the positive
charge (the "source" of the lines of force having "strength" $4\pi q$; see
Sec. 10.7) and terminate on the negative charge (this is the "sink"; see
Sec. 10.7). By virtue of the relation (20), these lines are normal to the
equipotential surfaces (they intersect them at right angles).

If h is infinitely small, that is, if the source and the sink are
located infinitely close together, then the system of charges is called
a dipole. However, if in this situation the charges themselves remain
finite, then their fields merely cancel out to yield a zero field. For
this reason, the only interest is in the case where the charges are
infinitely large and such that the product m = qh (this product is called
the dipole moment) remains finite. In space, a dipole is characterized
not only by its location and moment but also by its direction; to
obtain the direction, draw the axis through the charges in the
direction from the negative to the positive charge.

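To derive the field of a dipole, consider Fig. 129, where l is the
axis of the dipole. For a sufficiently small h, at any point M we
will have

$$\mathbf{E} = \frac{q}{r_1^3}\,\mathbf{r}_1 - \frac{q}{r^3}\,\mathbf{r} = qh\,\frac{\dfrac{\mathbf{r}_1}{r_1^3} - \dfrac{\mathbf{r}}{r^3}}{h} = m\,\frac{d}{dl}\left(\frac{\mathbf{r}}{r^3}\right) \qquad (31)$$

where $\dfrac{d}{dl}$ denotes the derivative obtained under a change in the position
of the charge (it differs by the factor -1 from the derivative obtained
under a change in the position of the observation point M).
Simplifying the right side of (31), we get (verify this!)

$$\mathbf{E} = m\left(\frac{1}{r^3}\frac{d\mathbf{r}}{dl} - \frac{3\mathbf{r}}{r^4}\frac{dr}{dl}\right) = \frac{m}{r^3}\left(\frac{3\mathbf{r}}{r}\cos\alpha - \mathbf{l}^0\right)$$

where $\mathbf{l}^0$ is the unit vector along the l-axis and $\alpha$ is the angle between
the vectors $\mathbf{l}^0$ and $\mathbf{r}$.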
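
The passage to the limit can be imitated numerically by taking a small but finite h and a correspondingly large charge q = m/h. The following sketch is illustrative only: it assumes the dipole sits at the origin with its axis along x, so that $\mathbf{l}^0 = \mathbf{i}$ and $\mathbf{r}$ is the radius vector of the observation point; the function names are hypothetical.

```python
import math

def two_charge_field(point, q, h):
    """Field at `point` of charges +q at (h/2, 0, 0) and -q at (-h/2, 0, 0)."""
    field = [0.0, 0.0, 0.0]
    for charge, x0 in ((q, h / 2.0), (-q, -h / 2.0)):
        d = (point[0] - x0, point[1], point[2])
        dist3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2) ** 1.5
        for i in range(3):
            field[i] += charge * d[i] / dist3  # formula (19) for each charge
    return field

def dipole_field(point, m):
    """Limiting formula E = (m / r^3) (3 r0 cos(alpha) - l0) with l0 along x."""
    r = math.sqrt(sum(c * c for c in point))
    r0 = [c / r for c in point]
    l0 = (1.0, 0.0, 0.0)
    cos_alpha = r0[0]  # cosine of the angle between l0 and r
    return [m / r ** 3 * (3.0 * r0[i] * cos_alpha - l0[i]) for i in range(3)]

m, h = 1.0, 1e-3
point = (1.0, 0.5, -0.7)
print(two_charge_field(point, m / h, h))
print(dipole_field(point, m))
```

The two printed vectors should differ only by terms of relative order $(h/r)^2$.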




A similar consideration of the potential of the dipole yields the
expression

$$\varphi = m\,\frac{d}{dl}\left(\frac{1}{r}\right) = \frac{m\cos\alpha}{r^2}$$

If we assume the axis of the dipole to coincide with the x-axis
and, like Fig. 128, consider the picture in the xz-plane, then we get
$\cos\alpha = x/r$, or $\varphi = m\dfrac{x}{r^3} = m\dfrac{x}{(x^2 + z^2)^{3/2}}$. Fig. 130 shows the traces of
the equipotential surfaces (which are oval surfaces of order six) and the
electric lines of force, which, as we see, originate on the dipole and
terminate there. (Think about how to obtain Fig. 130 from Fig. 128.)

Exercises

1.  Using  integration,  compute  the  field  of  an  infinite  homogeneous 
charged  rectilinear  string. 

2.  Using  integration,  compute  the  field  of  an  infinite  homogeneous 
charged  plane. 

3. Compute the field and the potential of a "plane dipole" which
is the result of superimposing two oppositely charged
homogeneous rectilinear infinitely close strings.

10.7  General  vector  field  and  its  divergence 

It is now easy to formulate the foregoing concepts in general
form for a certain stationary vector field A = A(M), irrespective
of the physical meaning. It is natural to generalize the concept of
vector lines as lines along the field at every point. As in Sec. 10.4, we
introduce the concept of the flux of a vector A through an oriented
surface $(\sigma)$ by the formula $Q = \int_{(\sigma)} A_n\,d\sigma = \int_{(\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma}$. This flux is
otherwise spoken of as the number of outward vector lines
intersecting $(\sigma)$.

As in Sec. 10.5, we can regard the flux of the vector A through a closed
surface $(\sigma)$,

$$Q = \oint_{(\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma} \qquad (32)$$

(the circle on the integral sign is not obligatory, since it merely serves
to emphasize that the integral is taken round a closed surface). Such
a flux has the following simple property: if a certain body $(\Omega)$ is
partitioned with the aid of certain surfaces into parts $(\Omega_1)$, $(\Omega_2)$, ...,
then the outward flux of the field through the surface of the body
$(\Omega)$ is equal to the sum of analogous fluxes taken for each of the
bodies $(\Omega_1)$, $(\Omega_2)$, ... . For example, in Fig. 131 we have a partition
of $(\Omega)$ into the two parts $(\Omega_1)$ and $(\Omega_2)$. The flux of the vector A
through the surface of the body $(\Omega_1)$ can be represented as a sum
of two integrals, over $(\sigma_1)$ and over $(\sigma_3)$; now the flux through
the surface of the body $(\Omega_2)$ is equal to the sum of the integrals
over $(\sigma_2)$ and over $(\sigma_3)$. If these fluxes are combined, the
integrals over $(\sigma_3)$ cancel out (why?), and the integrals over $(\sigma_1)$
and $(\sigma_2)$ together yield the flux of vector A through the surface of
the body $(\Omega)$.

This property enables us to interpret the integral (32) as the
number of vector lines originating inside the volume $(\Omega)$ bounded
by the surface $(\sigma)$. If Q > 0, we say that there is a source of vector
lines in $(\Omega)$, and Q is called the strength of this source. If Q < 0, we
say that there is a sink (or, what is the same thing, a source of
negative strength) in $(\Omega)$. For the sake of simplicity we will always regard
a sink as a special case of the source. If Q = 0, then either there are
no sources or sinks in $(\Omega)$ or they cancel out; incidentally, even
when $Q \ne 0$, there may be both sources and sinks in $(\Omega)$, which,
however, are only partially compensated for in this case.

Fig. 131

Suppose the vector lines arise throughout the space. We can
speak of the mean density of the source $Q/\Omega$ in any volume $(\Omega)$ (by
$\Omega$ is meant the numerical value of the volume $(\Omega)$) and we can speak
of the density of the source at any point M of the space. This density
is equal to

$$\lim_{(\Delta\Omega)\to M} \frac{\Delta Q}{\Delta\Omega} = \lim_{(\Delta\Omega)\to M} \frac{\oint_{(\Delta\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma}}{\Delta\Omega} \qquad (33)$$

where $(\Delta\Omega)$ is a small volume containing the point M and $(\Delta\sigma)$ is
the surface of the small volume. This density of the source is also
known as the divergence of the vector field A and is denoted by div A.
We can thus say that the divergence of a vector field is the number
of vector lines originating in an infinitely small volume (or, what
is the same thing, the flux of field A through the surface of this
volume) referred to a unit of this volume. Note that the divergence
of a vector field is a scalar quantity or, to be more precise, it forms
a scalar field, since it assumes a value at every point of space.

Formula (33) may be rewritten as

$$\operatorname{div}\mathbf{A} = \frac{dQ}{d\Omega}, \quad \text{i.e.} \quad dQ = \operatorname{div}\mathbf{A}\,d\Omega \qquad (34)$$

The result is an expression for the number of vector lines originating
in an elementary volume $(d\Omega)$. Summing (Sec. 4.7), we get an
expression for the number of vector lines originating in a finite volume $(\Omega)$,
i.e. for the flux of vector A,

$$\oint_{(\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma} = \int_{(\Omega)} \operatorname{div}\mathbf{A}\,d\Omega \qquad (35)$$

where $(\Omega)$ is a finite volume and $(\sigma)$ is its surface. This important
formula is called the Ostrogradsky formula (also known as the
divergence theorem).

It is sometimes convenient to compute the divergence directly
on the basis of its definition (33). Consider, for example, a
central-symmetric field in space defined by the formula

$$\mathbf{A} = f(r)\,\mathbf{r}^0 \qquad (36)$$

As was shown in Sec. 10.4 (formulas (16) and (17)), the field flux
through a sphere of radius r with centre at the origin is equal to
$Q(r) = 4\pi r^2 f(r)$, and so the number of vector lines originating in a thin layer
between two such spheres is equal to
$dQ = 4\pi\,d[r^2 f(r)] = 4\pi[2r f(r) + r^2 f'(r)]\,dr$.

Hence, per unit volume of this layer we have

$$\operatorname{div}\mathbf{A} = \frac{dQ}{4\pi r^2\,dr} = \frac{2}{r} f(r) + f'(r) \qquad (37)$$
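
Formula (37) can be compared with the definition (33) applied literally: the flux through the surface of a small volume divided by that volume. The sketch below is an illustration under stated assumptions: the small volume is a cube of side 0.01, the flux through each face is approximated by the midpoint rule, and all function names are hypothetical.

```python
import math

def flux_through_cube(field, centre, side, n=8):
    """Outward flux of `field` through a small cube (midpoint rule, n*n cells per face)."""
    half = side / 2.0
    cell = side / n
    offsets = [-half + (i + 0.5) * cell for i in range(n)]
    total = 0.0
    for axis in range(3):                 # faces perpendicular to x, y, z in turn
        b, c = (axis + 1) % 3, (axis + 2) % 3
        for sign in (-1.0, 1.0):          # outward normal is -e_axis or +e_axis
            for u in offsets:
                for v in offsets:
                    point = list(centre)
                    point[axis] += sign * half
                    point[b] += u
                    point[c] += v
                    total += sign * field(point)[axis] * cell * cell
    return total

def central_field(f):
    """Build the field A = f(r) r0 of formula (36) from a scalar function f."""
    def field(point):
        r = math.sqrt(sum(c * c for c in point))
        return [f(r) * c / r for c in point]
    return field

def divergence_by_definition(field, centre, side=1e-2):
    """Definition (33) with a small but finite cube instead of the limit."""
    return flux_through_cube(field, centre, side) / side ** 3

point = (1.0, 2.0, 2.0)  # here r = 3
print(divergence_by_definition(central_field(lambda r: r ** 2), point))
print(divergence_by_definition(central_field(lambda r: 1.0 / r ** 2), point))
```

For $f(r) = r^2$ formula (37) gives $\frac{2}{r}r^2 + 2r = 4r$, i.e. 12 at the chosen point, while for $f(r) = r^{-2}$ it gives zero away from the origin; the printed values should be close to these.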

In particular, a central-symmetric field without sources
outside the coordinate origin is characterized by the fact that

$$\frac{2}{r} f(r) + f'(r) = 0, \quad \text{whence} \quad f(r) = \frac{C}{r^2} \quad (C \text{ constant})$$

We have arrived at the Newtonian law (10). By virtue of formula
(17), $Cr^{-2}\cdot 4\pi r^2 = 4\pi C$ vector lines originate at the very origin of
coordinates. But a point source of strength $4\pi C$ at the origin has
a density of $4\pi C\,\delta(\mathbf{r})$. Thus, by (36),

$$\operatorname{div}\left(\frac{C}{r^2}\,\mathbf{r}^0\right) = 4\pi C\,\delta(\mathbf{r}), \quad \text{i.e.} \quad \operatorname{div}\left(\frac{\mathbf{r}}{r^3}\right) = 4\pi\,\delta(\mathbf{r}) \qquad (38)$$

In this important instance, forgetting the delta function would have
resulted in a gross error! Here we have a three-dimensional delta
function (see Sec. 6.3): $\delta(\mathbf{r}) = \delta(x)\,\delta(y)\,\delta(z)$. Using the delta function
is much more convenient than long verbal statements like "the
divergence is everywhere zero except at a singular point of the field,
where it is infinite", and the like. The origin of coordinates is a
singular point since the field at this point is infinite and does not
have a definite direction. In this example, all vector lines go to
infinity. Think over where the vector lines originate and where they
terminate (or where they go to) if f(r) increases more slowly or faster
than $r^{-2}$ as $r \to 0$.

Fig. 132


The definition (33) of divergence was given in invariant form,
which does not depend on the choice of coordinate system. It is also
of interest to derive the formula for computing the divergence with
the aid of an xyz-coordinate system. To do this, we take advantage
of the fact that in formula (34) the form of the elementary volume
$(d\Omega)$ is inessential, and for this volume we choose an infinitesimal
rectangular parallelepiped with edges parallel to the coordinate axes
(Fig. 132). Then the flux dQ of the vector A through the surface of
the parallelepiped (that is, the numerator of the fraction on the
right of (34)) can be represented as a sum of six terms corresponding
to the six faces of the parallelepiped. Let us consider the sum of two
of these terms that correspond to the back and front faces; we denote
them by I and II respectively. Then $(A_n)_{II} = (A_x)_{II}$, $(A_n)_I = -(A_x)_I$
(why?), and for this reason the indicated sum can be written as

$$\int_{I} A_n\,d\sigma + \int_{II} A_n\,d\sigma = -\int (A_x)_I\,d\sigma + \int (A_x)_{II}\,d\sigma = \int \left[(A_x)_{II} - (A_x)_I\right]d\sigma$$

To within higher order infinitesimals, the integrand is equal to
$d_x A_x = \dfrac{\partial A_x}{\partial x}\,dx$. This is the "partial differential" of $A_x$ with
respect to x resulting from the fact that the points of the front face
differ from the corresponding points of the back face by the value
of the x-coordinate. For this reason the entire integral is, up to
higher order terms, equal to

$$\int \frac{\partial A_x}{\partial x}\,dx\,d\sigma = \frac{\partial A_x}{\partial x}\,dx\int d\sigma = \frac{\partial A_x}{\partial x}\,dx\,dy\,dz$$




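Carrying out similar computations for the other two pairs of faces,
we get the entire elementary flux:

$$dQ = \left(\frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}\right)dx\,dy\,dz$$

But since $d\Omega = dx\,dy\,dz$, by the first formula of (34) we finally get

$$\operatorname{div}\mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} \qquad (39)$$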


In conclusion, note the following. From formula (33) it is readily
apparent that the vector flux through a small closed surface
is of the order not of the area of the surface (as one might expect
from the surface integral) but of the volume bounded by it, that is, it
is an infinitesimal of third order and not of second order compared
with the linear dimensions. This can be explained as follows. The
vector flux through an open surface is indeed of the second order.
However, the flux of a constant vector through a closed surface
is always equal to zero! (This property follows easily from the
preceding material; prove it, for instance, with the aid of the last formula
of Sec. 10.4 or of formula (35).) For this reason, if the field within
a small closed surface is expanded by Taylor's formula (Sec. 6.6),
then the integral of the first, constant, term yields zero, while
the integral of the first-order terms will be of the third order of
smallness.

This remark does not refer to the case where, inside the closed
surface, there are singularities of the field at which the volume
density of the source becomes infinite. If there is a source of vector
lines distributed over an area within the surface, the flux is of second
order; if the source is distributed along a line, then the flux is
of first order (proportional to the length of the line); finally, the
flux through a surface inside which there is a point source (i.e. the
divergence is of the nature of the delta function) is not an
infinitesimal at all. This is precisely the case that we encountered
above in the extremely important instance of a Coulomb field (see
formula (38)).


Exercises 

1. Use the Ostrogradsky formula to find the outward flux through
the sphere $x^2 + y^2 + z^2 = R^2$ of the fields $\mathbf{A} = 2x\,\mathbf{i} + y\,\mathbf{k}$,
$\mathbf{B} = y^2\,\mathbf{i} - x^2\,\mathbf{j}$.

2. Derive the formula (37) with the aid of (39).

3. Find the divergence of the plane-parallel axial symmetric field
$\mathbf{A} = f(|\boldsymbol\rho|)\,\boldsymbol\rho^0$. In what case is this field devoid of sources
outside the z-axis?
10.8 The divergence of a velocity field and the continuity equation

Let us now return to a consideration of the stationary flow of
a gas (see Sec. 10.4); we assume that the mass of the gas does not
change in the process. In Sec. 10.4 we established the fact that the
flux of the velocity vector $\mathbf{v}$ through an oriented surface $(\sigma)$ is equal
to the volume of the gas carried outward through $(\sigma)$ in unit time.
Hence if the surface $(\sigma)$ is closed, then this flux is equal to the
difference between the volume emerging in unit time from $(\sigma)$ over
that portion of the surface where $\mathbf{v}$ is directed outwards, and the
volume entering through $(\sigma)$ over that portion of the surface where
$\mathbf{v}$ is directed inwards.

Why can the flux of the vector $\mathbf{v}$ through a closed surface be
different from zero? If the gas is taken to be incompressible during
the flow process (such is the case for flow velocities considerably
less than the speed of sound and also for liquids), then the
incoming volume is equal to the outflow and so the indicated flux is
zero. However, if $(\sigma)$ is in a zone where the gas expands in its flow,
then the outgoing volume is greater than the incoming volume and
therefore the overall flux will be positive; similarly, in the
compression zone the overall flux is negative.

It is now clear what the divergence of the velocity vector means.
By virtue of Sec. 10.7, to obtain $\operatorname{div}\mathbf{v}$ it is necessary to divide the
flux of the vector $\mathbf{v}$ through the surface of an infinitesimal volume
$(d\Omega)$ by the numerical value $d\Omega$ of this volume. We thus have the
ratio of the volume "originating" in $(d\Omega)$ in unit time to the volume
$d\Omega$. It is natural to call this ratio the "rate of relative increase in
volume". From this it is clear that the equation $\operatorname{div}\mathbf{v} = 0$ is the
condition for incompressibility of a gas, whereas the relations
$\operatorname{div}\mathbf{v} > 0$ and $\operatorname{div}\mathbf{v} < 0$ hold, respectively, in the zone of expansion
and the zone of compression.

Consider the field of "mass velocity" $\rho\mathbf{v}$. In Sec. 10.4 we showed
that the flux of such a vector through an oriented surface $(\sigma)$ is equal
to the mass of the gas carried outwards in unit time through $(\sigma)$.
But since the mass of the gas remains unchanged, the total flux
of the vector $\rho\mathbf{v}$ through any closed surface is necessarily equal to
zero, since the influx of mass is the same as the efflux. This also
holds for an infinitesimal volume and so, by Sec. 10.7, we arrive
at the equation

$$\operatorname{div}(\rho\mathbf{v}) = 0 \qquad (40)$$

for stationary flow in which the mass of gas in every element of
volume remains unchanged with time. This is the continuity equation,
which is one of the basic relations of hydrodynamics and expresses
the law of conservation of mass.
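Let us write out this expression in coordinates. The vector $\rho\mathbf{v}$
clearly has the components $\rho v_x$, $\rho v_y$, $\rho v_z$, and so

$$\operatorname{div}(\rho\mathbf{v}) = \frac{\partial}{\partial x}(\rho v_x) + \frac{\partial}{\partial y}(\rho v_y) + \frac{\partial}{\partial z}(\rho v_z)$$

$$= \rho\left(\frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z}\right) + v_x\frac{\partial\rho}{\partial x} + v_y\frac{\partial\rho}{\partial y} + v_z\frac{\partial\rho}{\partial z}$$

$$= \rho\,\operatorname{div}\mathbf{v} + \mathbf{v}\cdot\operatorname{grad}\rho$$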

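(this will be derived differently in Sec. 11.8). Hence, the condition
for conservation of mass, $\operatorname{div}(\rho\mathbf{v}) = 0$, leads to the relation

$$\operatorname{div}\mathbf{v} = -\frac{\mathbf{v}\cdot\operatorname{grad}\rho}{\rho}$$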
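The identity div(ρv) = ρ div v + v · grad ρ used above can be spot-checked with central differences. This is a minimal sketch, not part of the original text: the density and velocity fields, the sample point and the helper names (`partial`, `divergence`, `gradient`) are hypothetical choices made for illustration.

```python
import math


def partial(f, point, axis, h=1e-5):
    """Central-difference estimate of the partial derivative of f along `axis`."""
    plus = list(point)
    minus = list(point)
    plus[axis] += h
    minus[axis] -= h
    return (f(*plus) - f(*minus)) / (2.0 * h)


def rho(x, y, z):
    # Hypothetical smooth density field.
    return 1.0 + 0.5 * math.exp(-(x * x + y * y + z * z))


def v(x, y, z):
    # Hypothetical velocity field.
    return (y * z, math.sin(x), x + z * z)


def mass_velocity(x, y, z):
    # The "mass velocity" rho * v.
    r = rho(x, y, z)
    return tuple(r * c for c in v(x, y, z))


def divergence(F, point):
    # Formula (39): sum of the partial derivatives of the components.
    return sum(partial(lambda *p, k=k: F(*p)[k], point, k) for k in range(3))


def gradient(f, point):
    return [partial(f, point, k) for k in range(3)]


p = (0.4, -0.2, 0.9)
lhs = divergence(mass_velocity, p)
rhs = rho(*p) * divergence(v, p) + sum(a * b for a, b in zip(v(*p), gradient(rho, p)))
print(lhs, rhs)
```

Both sides are finite-difference estimates, so they agree only up to the discretization and rounding errors of the scheme.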

For an incompressible fluid $\rho = \text{constant}$, $\operatorname{grad}\rho = 0$, $\operatorname{div}\mathbf{v} = 0$.
What is the quantity $\mathbf{v}\cdot\operatorname{grad}\rho$ in the stationary flow of a
compressible fluid [such precisely is the flow we are considering, and
for it, $\operatorname{div}(\rho\mathbf{v}) = 0$]? Let us find the variation of density of a
particle. In a stationary flow $\rho = \rho(x, y, z)$, so the density at every point
does not depend on t, i.e. $\partial\rho/\partial t = 0$. But when we consider a given
particle, it is necessary to take into account that its coordinates
depend on the time. The velocity $\mathbf{v}$ is precisely what characterizes
this relationship:


$$\frac{dx}{dt} = v_x, \qquad \frac{dy}{dt} = v_y, \qquad \frac{dz}{dt} = v_z$$


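By the general rule (cf. formula (5) of Ch. 4) and considering
$\rho = \rho[x(t), y(t), z(t)] = \rho(t)$, we find

$$\frac{d\rho}{dt} = \frac{\partial\rho}{\partial x}\frac{dx}{dt} + \frac{\partial\rho}{\partial y}\frac{dy}{dt} + \frac{\partial\rho}{\partial z}\frac{dz}{dt} = \mathbf{v}\cdot\operatorname{grad}\rho$$

Thus

$$\operatorname{div}\mathbf{v} = -\frac{1}{\rho}\frac{d\rho}{dt}$$

This is the quantitative expression of the earlier statements
concerning $\operatorname{div}\mathbf{v}$ in a region where the gas is compressed ($d\rho/dt > 0$)
or expands ($d\rho/dt < 0$) for $\partial\rho/\partial t = 0$.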

Now let us consider a nonstationary gas flow when the velocity
field and other fields vary with time. All the earlier introduced
notions will have to be considered (as in the case of a nonstationary
field with arbitrary physical meaning) for any fixed instant of time.
Then the flow lines (i.e. lines which are constructed for any fixed
time and at that time lie along the velocity vector at every point)
no longer coincide with the particle paths of the liquid (give some
thought to why this is so!).
If in the nonstationary case we consider a volume $(d\Omega)$, then
the flux of the vector $\rho\mathbf{v}$ through its surface is equal, by (34), to
$\operatorname{div}(\rho\mathbf{v})\,d\Omega$ and yields the rate of decrease in mass of gas in that
volume, that is, the mass of gas emanating from this volume in unit time.
However, the mass of gas contained in an infinitesimal volume $(d\Omega)$ is
equal to $\rho\,d\Omega$ and for this reason increases* at the rate of

$$\frac{\partial}{\partial t}(\rho\,d\Omega) = \frac{\partial\rho}{\partial t}\,d\Omega$$

We thus get

$$\operatorname{div}(\rho\mathbf{v})\,d\Omega = -\frac{\partial\rho}{\partial t}\,d\Omega$$

whence

$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = 0 \qquad (41)$$

This is the continuity equation, which for the nonstationary case
as well signifies that in the process of motion the gas is conserved
(does not appear or disappear). In the stationary case, equation (41)
clearly becomes equation (40).
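The meaning of equation (41) as a mass balance can be illustrated by a one-dimensional discrete analogue. The sketch below is an assumption-laden illustration rather than anything taken from the text: it uses a periodic grid, a hypothetical positive velocity profile and a simple upwind flux, so that the mass leaving one cell is exactly the mass entering its neighbour.

```python
import math


def step_density(rho, v, dx, dt):
    """One explicit step of d(rho)/dt + d(rho*v)/dx = 0 on a periodic grid.

    Upwind fluxes are used (all v > 0 is assumed): whatever leaves cell i-1
    through its right face enters cell i, so no mass is created or lost.
    """
    n = len(rho)
    # Mass per unit time passing through the right face of cell i.
    flux = [rho[i] * v[i] for i in range(n)]
    # flux[i - 1] with i == 0 wraps around to the last cell (periodic tube).
    return [rho[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]


n = 50
dx = 1.0 / n
dt = 0.2 * dx
xs = [(i + 0.5) * dx for i in range(n)]
velocity = [1.0 + 0.2 * math.cos(2.0 * math.pi * x) for x in xs]
density = [1.0 + 0.3 * math.sin(2.0 * math.pi * x) for x in xs]

mass_before = sum(density) * dx
for _ in range(100):
    density = step_density(density, velocity, dx, dt)
mass_after = sum(density) * dx
print(mass_before, mass_after)
```

The density profile changes from step to step, yet the total mass stays the same up to rounding, which is the discrete counterpart of the statement that the gas does not appear or disappear.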

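In the nonstationary case, the rate of change $\frac{d\rho}{dt}$ of density in
a particle (in contrast to the rate of change $\frac{\partial\rho}{\partial t}$ of density at a
fixed point) is equal to

$$\frac{d\rho}{dt} = \frac{\partial\rho}{\partial t} + \mathbf{v}\cdot\operatorname{grad}\rho$$

From this we find that in the nonstationary case the formula

$$\operatorname{div}\mathbf{v} = -\frac{1}{\rho}\frac{d\rho}{dt}$$

remains unchanged.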

Exercise 

Consider  a flux  of  "neutron  gas”  in  a reactor,  the  neutrons 
entering  the  flux  throughout  the  whole  volume  due  to  fission 
of  uranium  nuclei  in  the  volume.  How  will  the  continuity 
equation  change? 

10.9  The  divergence  of  an  electric  field  and  the  Poisson 
equation 

Let a charge with a certain density $\rho$ be distributed in space
and let it generate an electric field $\mathbf{E}$. In Sec. 10.5 we saw that the
flux Q of the vector $\mathbf{E}$ through the surface of any volume $(\Omega)$, i.e.


* The word "increases" is to be understood in the algebraic sense: if $\partial\rho/\partial t > 0$,
the mass inside $d\Omega$ increases, and if $\partial\rho/\partial t < 0$, the mass decreases.
the number of electric lines of force originating in $(\Omega)$, is equal to
the product of $4\pi$ by the charge q contained in $(\Omega)$. From this, if
the volume is infinitesimal, we get

$$\operatorname{div}\mathbf{E} = \frac{dQ}{d\Omega} = \frac{d(4\pi q)}{d\Omega} = 4\pi\frac{dq}{d\Omega}$$

or, finally,

$$\operatorname{div}\mathbf{E} = 4\pi\rho \qquad (42)$$


Thus, the divergence of the intensity of an electric field, i.e. the
density of origination of electric lines of force, is directly
proportional to the density of distributed charges.

In Sec. 10.5 we saw that an electric field has a potential $\varphi$
connected with the vector $\mathbf{E}$ by the relation (20). Substituting (20)
into (42), we get the equation of the potential:

$$\operatorname{div}\operatorname{grad}\varphi = -4\pi\rho$$

The combination div grad is called the Laplacian operator and is
denoted by $\Delta$; we can therefore rewrite the equation for the
potential as

$$\Delta\varphi = -4\pi\rho \qquad (43)$$

This is the Poisson equation; its solution (23), equal to zero at
infinity, was found earlier. In the portion of space free from charges,
equation (43) becomes

$$\Delta\varphi = 0 \qquad (44)$$

and is called the Laplace equation.

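It is easy to obtain an expression for the Laplacian operator
in Cartesian coordinates. From formulas (1) and (39) we get

$$\operatorname{div}\operatorname{grad}\varphi = \operatorname{div}\left(\frac{\partial\varphi}{\partial x}\mathbf{i} + \frac{\partial\varphi}{\partial y}\mathbf{j} + \frac{\partial\varphi}{\partial z}\mathbf{k}\right)$$

$$= \frac{\partial}{\partial x}\left(\frac{\partial\varphi}{\partial x}\right) + \frac{\partial}{\partial y}\left(\frac{\partial\varphi}{\partial y}\right) + \frac{\partial}{\partial z}\left(\frac{\partial\varphi}{\partial z}\right) = \frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2}$$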


So  equation  (44)  is  the  same  Laplace  equation  we  encountered  in 
Sec.  5.7  but  written  for  the  three-dimensional  case. 

Sometimes it is more convenient to compute the Laplacian
operator in other coordinate systems. For example, for a
central-symmetric field $\varphi(r)$, we get, by formulas (5) and (36)-(37),

$$\Delta\varphi(r) = \operatorname{div}\operatorname{grad}\varphi(r) = \frac{d^2\varphi}{dr^2} + \frac{2}{r}\frac{d\varphi}{dr}$$
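The Cartesian and the central-symmetric expressions for the Laplacian can be compared numerically on a sample field. The following sketch assumes a hypothetical field φ(r) = exp(-r²) and uses second-order finite differences; the function names and the sample point are illustrative only.

```python
import math


def phi_r(r):
    # Hypothetical central-symmetric field phi(r) = exp(-r^2).
    return math.exp(-r * r)


def phi(x, y, z):
    return phi_r(math.sqrt(x * x + y * y + z * z))


def laplacian_cartesian(f, x, y, z, h=1e-3):
    """Sum of the three second partial derivatives (second differences)."""
    c = f(x, y, z)
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h) - 6.0 * c) / (h * h)


def laplacian_radial(g, r, h=1e-3):
    """d2g/dr2 + (2/r) dg/dr, with both derivatives taken by central differences."""
    d1 = (g(r + h) - g(r - h)) / (2.0 * h)
    d2 = (g(r + h) - 2.0 * g(r) + g(r - h)) / (h * h)
    return d2 + 2.0 * d1 / r


x, y, z = 0.5, -0.3, 0.8
r = math.sqrt(x * x + y * y + z * z)
cart = laplacian_cartesian(phi, x, y, z)
radial = laplacian_radial(phi_r, r)
exact = (4.0 * r * r - 6.0) * math.exp(-r * r)   # computed by hand for this phi
print(cart, radial, exact)
```

For this particular field the Laplacian can be found by hand, which gives a third value to compare the two numerical estimates against.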


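The Laplacian operator is connected by a simple formula with
the mean value of the function on a small sphere. To derive this
relationship, set $\mathbf{A} = \operatorname{grad}\varphi$ in the Ostrogradsky formula (35). We
then have

$$\oint_{(\sigma)} \operatorname{grad}_n\varphi\,d\sigma = \int_{(\Omega)} \operatorname{div}\operatorname{grad}\varphi\,d\Omega, \quad\text{or}\quad \oint_{(\sigma)} \frac{\partial\varphi}{\partial n}\,d\sigma = \int_{(\Omega)} \Delta\varphi\,d\Omega$$

Suppose that $(\sigma) = (\sigma)_r$ is a sphere of radius r with centre at some
point O. Then $d\sigma = r^2\,d\omega$, where $d\omega$ is an element of the solid
angle with vertex O (Sec. 10.5), whence

$$\oint \frac{\partial\varphi}{\partial r}\,r^2\,d\omega = \int_{(\Omega)_r} \Delta\varphi\,d\Omega, \quad\text{or}\quad \frac{d}{dr}\oint \varphi\,d\omega = \frac{1}{r^2}\int_{(\Omega)_r} \Delta\varphi\,d\Omega$$

(We passed to integration with respect to $\omega$ so that the range of
integration should not be dependent on the parameter r and so that
it would be possible to apply the rule of Sec. 3.6 for differentiating
an integral with respect to a parameter.) Integrating this last equation
with respect to r, beginning with r = 0, we get

$$\oint \varphi\,d\omega - \left.\oint \varphi\,d\omega\,\right|_{r=0} = \int_0^r \frac{ds}{s^2}\int_{(\Omega)_s} \Delta\varphi\,d\Omega$$

But $\varphi|_{r=0} = \varphi(O) = \text{constant}$, $\oint d\omega = 4\pi$ (why?), whence, dividing
by $4\pi$, we get

$$\frac{1}{4\pi r^2}\oint_{(\sigma)_r} \varphi\,d\sigma = \varphi(O) + \frac{1}{4\pi}\int_0^r \frac{ds}{s^2}\int_{(\Omega)_s} \Delta\varphi\,d\Omega \qquad (45)$$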


We will derive certain consequences from this formula. First
of all, suppose that $\Delta\varphi = 0$ in some region including the sphere
$(\Omega)_r$; then the function $\varphi$ is called a harmonic function (cf. Sec. 5.7)
and by (44)

$$\varphi(O) = \frac{1}{4\pi r^2}\oint_{(\sigma)_r} \varphi\,d\sigma \qquad (46)$$

Thus, the value of the harmonic function at the centre of a sphere
is equal to the mean value of this function on the sphere.

From this it follows in particular that a harmonic function
cannot have any maxima or minima inside the domain of its definition.
Indeed, if, say, there were a minimum at O and r is sufficiently
small, then the mean value of the function $\varphi$ on $(\sigma)_r$ would be
greater than $\varphi(O)$ (why?), but this contradicts (46). This property
is demonstrated differently in HM, Sec. 6.3, where a corollary to
it was also noted: a charge in an electrostatic field in a region free
of other charges cannot have positions of stable equilibrium.
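Another consequence of formula (45) is obtained for an
arbitrary function $\varphi$ if r is taken to be small. Then $(\Delta\varphi)_{(\Omega)_s} \approx \Delta\varphi|_O$,
whence

$$\int_0^r \frac{ds}{s^2}\int_{(\Omega)_s} \Delta\varphi\,d\Omega \approx \int_0^r \frac{1}{s^2}\cdot\frac{4}{3}\pi s^3\,\Delta\varphi|_O\,ds = \frac{2\pi}{3}\,r^2\,\Delta\varphi|_O$$

For this reason, to within quantities that are small for small r,

$$\Delta\varphi|_O = \frac{6}{r^2}\left[\frac{1}{4\pi r^2}\oint_{(\sigma)_r} \varphi\,d\sigma - \varphi(O)\right] = 6\,\frac{\bar\varphi_{(\sigma)_r} - \varphi(O)}{r^2} \qquad (47)$$

where $\bar\varphi_{(\sigma)_r}$ signifies the mean over the sphere $(\sigma)_r$. This then shows
that, in particular, if $\bar\varphi_{(\sigma)_r} = \varphi(O)$, that is to say, if the above
described property of the mean over spheres holds, then $\Delta\varphi = 0$,
which means the function $\varphi$ is harmonic.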
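Relations (46) and (47) lend themselves to a direct numerical check. The sketch below is illustrative only: the two sample functions, the centre O, the radius and the helper `sphere_mean` are hypothetical, and the mean over the sphere is approximated by a midpoint rule in cos θ and in the azimuth.

```python
import math


def sphere_mean(f, center, r, n=60):
    """Mean value of f over the sphere of radius r about `center`.

    Midpoint rule in u = cos(theta) and in the azimuth psi; equal steps in u
    and psi correspond to equal elements of solid angle.
    """
    cx, cy, cz = center
    total = 0.0
    for i in range(n):
        u = -1.0 + (2.0 * i + 1.0) / n
        s = math.sqrt(1.0 - u * u)
        for j in range(n):
            psi = 2.0 * math.pi * (j + 0.5) / n
            total += f(cx + r * s * math.cos(psi),
                       cy + r * s * math.sin(psi),
                       cz + r * u)
    return total / (n * n)


def harmonic(x, y, z):
    # Hypothetical harmonic function: x^2 - y^2 + 3z plus the potential of a
    # unit point charge placed at (0, 0, 2), outside the sphere used below.
    return x * x - y * y + 3.0 * z + 1.0 / math.sqrt(x * x + y * y + (z - 2.0) ** 2)


def quadratic(x, y, z):
    # Not harmonic: its Laplacian is 2 + 4 + 6 = 12 everywhere.
    return x * x + 2.0 * y * y + 3.0 * z * z + x * y


O = (0.2, -0.1, 0.3)
r = 0.5

# Formula (46): the mean over the sphere equals the value at the centre.
mean_harmonic = sphere_mean(harmonic, O, r)
# Formula (47): the Laplacian recovered from the excess of the mean over phi(O).
laplacian_estimate = 6.0 * (sphere_mean(quadratic, O, r) - quadratic(*O)) / r ** 2
print(mean_harmonic, harmonic(*O), laplacian_estimate)
```

For a quadratic function the relation (47) holds without higher-order corrections, so the only discrepancy in the second check comes from the quadrature.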

Formula (47) is very reminiscent of the formula

$$\varphi''(x) \approx \frac{2}{h^2}\left[\frac{\varphi(x+h) + \varphi(x-h)}{2} - \varphi(x)\right]$$

which is readily verified with the aid of the Taylor formula for
small h for the function $\varphi(x)$ of one variable (do this!). In many
cases, formula (47) makes it possible to get a deeper understanding
of the meaning of the relations that involve the Laplacian operator.
For example, in the theory of heat conduction the equation
$\frac{\partial v}{\partial t} = \text{constant}\cdot\Delta v$ is derived for the propagation of heat in a
homogeneous medium, where v is the temperature. By (47) this means
that if the temperature around the point O is, on the average, higher
than at O, then the temperature at O must increase.
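The last remark can be seen in a one-dimensional discrete analogue of the heat equation, where the Laplacian is replaced by the second difference. This is a minimal sketch under assumptions not found in the text: a periodic row of five cells, a hypothetical initial temperature profile and an explicit time step.

```python
def heat_step(temps, alpha):
    """One explicit step of dv/dt = const * d2v/dx2 on a periodic 1-D grid.

    alpha = const * dt / dx**2 (assumed <= 0.5 for stability). Each cell moves
    towards the average of its two neighbours.
    """
    n = len(temps)
    return [temps[i] + alpha * (temps[i - 1] - 2.0 * temps[i] + temps[(i + 1) % n])
            for i in range(n)]


# Hypothetical initial temperatures: a cold cell surrounded by warmer ones.
temps = [20.0, 20.0, 10.0, 20.0, 20.0]
after = heat_step(temps, 0.25)
print(after)
```

The cold middle cell, whose neighbours are on the average warmer, heats up, while the total amount of heat in the row is unchanged.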

Exercises 

1. What is $\Delta\varphi(|\boldsymbol\rho|)$ equal to, where $\boldsymbol\rho = x\mathbf{i} + y\mathbf{j}$ (Sec. 10.6)?

2. Prove that the value of a harmonic function at the centre of
a sphere of any radius is equal to the mean value of the function
on the sphere.

3. Prove formula (47) by expanding the function $\varphi$ in a Taylor
series.

10.10  An  area  vector  and  pressure 

In Sec. 10.4 we said that an oriented plane area is customarily
depicted by a vector. This approach is particularly natural when
investigating forces of pressure in a fluid. Indeed, suppose the
pressure is p at some point. This is of course a scalar quantity.* Suppose

* The fluid obeys the Pascal law; we do not consider either strength or
viscosity, which lead to distinct forces acting on areas that have different directions.

an oriented element of surface $(d\sigma)$ is located at this point (see
Fig. 120). Then the force exerted on this element by a liquid located
on the inside of the surface is directed outward along the normal to
$(d\sigma)$ and is numerically equal to $p\,d\sigma$; this force is thus equal to

$$d\mathbf{F} = p\,d\boldsymbol\sigma \qquad (48)$$

The  vector  of  the  area  is  thus  interpreted  in  a straightforward 
manner. 

It is now easy to compute the overall force which the liquid
exerts on an arbitrarily chosen volume $(\Omega)$ with surface $(\sigma)$. To do
this, we have to sum the elementary forces (48) but with the sign
reversed on the right side, since the surface $(\sigma)$ will, as in the preceding
sections, be considered to have the inner side facing $(\Omega)$ and not
facing the pressing liquid. We then get

$$\mathbf{F} = -\oint_{(\sigma)} p\,d\boldsymbol\sigma \qquad (49)$$

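When computing the integral (49) it is often convenient to
pass to integration over $(\Omega)$. Note for this purpose that

$$p\,d\boldsymbol\sigma = p\,\mathbf{n}\,d\sigma = p\left[\cos(\widehat{\mathbf{n}, x})\,\mathbf{i} + \cos(\widehat{\mathbf{n}, y})\,\mathbf{j} + \cos(\widehat{\mathbf{n}, z})\,\mathbf{k}\right] d\sigma \qquad (50)$$

(see end of Sec. 9.1). On the other hand, $p\cos(\widehat{\mathbf{n}, x}) = (p\,\mathbf{i})_n$, where
the subscript n denotes the projection on the outer normal. For
this reason, by virtue of the Ostrogradsky formula (35) and formula
(39) for computing the divergence, we get

$$\oint_{(\sigma)} p\cos(\widehat{\mathbf{n}, x})\,d\sigma = \oint_{(\sigma)} (p\,\mathbf{i})_n\,d\sigma = \int_{(\Omega)} \operatorname{div}(p\,\mathbf{i})\,d\Omega = \int_{(\Omega)} \frac{\partial p}{\partial x}\,d\Omega$$

and the integral of the first term on the right of (50) turns out equal
to $\mathbf{i}\int_{(\Omega)} \frac{\partial p}{\partial x}\,d\Omega$. The integrals of the other two terms are transformed
in similar fashion and as a result we get

$$\oint_{(\sigma)} p\,d\boldsymbol\sigma = \int_{(\Omega)} \left(\frac{\partial p}{\partial x}\mathbf{i} + \frac{\partial p}{\partial y}\mathbf{j} + \frac{\partial p}{\partial z}\mathbf{k}\right) d\Omega = \int_{(\Omega)} (\operatorname{grad} p)\,d\Omega \qquad (51)$$

(see formula (1)). The force of the pressure is

$$\mathbf{F} = -\int_{(\Omega)} (\operatorname{grad} p)\,d\Omega \qquad (52)$$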

Let us consider some examples. First let the pressure be constant,
p = constant. Then $\operatorname{grad} p = 0$ and formula (52) shows that $\mathbf{F} = 0$
as well.
The conclusion is that a constant pressure applied to all sides of
a body  of  arbitrarily  complex  shape  yields  a resultant  force  of  zero. 
It  is  precisely  for  this  reason  that  the  atmospheric  pressure,  which 
is  very  strong  but  acts  with  constant  force  on  the  complicated  surface 
of  our  body,  does  not  push  us  out  of  the  atmosphere.  Actually,  this 
example  was  examined  at  the  end  of  Sec.  10.4;  in  particular,  we  see 
that  if  the  polyhedron  shown  in  Fig.  121  is  put  in  a liquid,  then 

p{  — *)+  PM  + PM  + PM  = 0 

Cancelling  out  the  p , we  arrive  at  formula  (18). 

Let us now consider the pressure in a liquid located in a field
of gravitation. If the z-axis is directed upwards from the free surface,
then it is a well-known fact that the pressure in the liquid depends
on z in the following manner:

$$p = p_0 - g\rho z$$

where $p_0$ is the pressure on the free surface,* $\rho$ is the density of the
liquid, and g is the acceleration of gravity. From this, by (52), we
get the buoyant force

$$\mathbf{F} = \int_{(\Omega)} g\rho\,\mathbf{k}\,d\Omega = g\rho\,\Omega\,\mathbf{k}$$

Since the product $g\rho\Omega$ is equal to the weight of the liquid in the
volume $(\Omega)$ we arrive at the famous Archimedean law.
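The passage from the surface integral (49) to the Archimedean law can be checked numerically for a submerged sphere. The sketch below rests on assumptions made purely for illustration: water-like density, a sphere of radius 0.1 centred one unit below the free surface, SI-style numbers, and a midpoint quadrature over the sphere; the function names are hypothetical.

```python
import math

G = 9.81          # acceleration of gravity (assumed value)
RHO = 1000.0      # density of the liquid (assumed value)
P0 = 101325.0     # pressure on the free surface (assumed value)


def hydrostatic_pressure(x, y, z):
    # p = p0 - g*rho*z, with the z-axis directed upwards from the free surface.
    return P0 - G * RHO * z


def pressure_force_on_sphere(pressure, center, R, n=100):
    """Approximate F = -(integral of p * n d(sigma)) over a sphere, formula (49)."""
    cx, cy, cz = center
    du = 2.0 / n
    dpsi = 2.0 * math.pi / n
    F = [0.0, 0.0, 0.0]
    for i in range(n):
        u = -1.0 + (i + 0.5) * du            # u = cos(theta)
        s = math.sqrt(1.0 - u * u)
        for j in range(n):
            psi = (j + 0.5) * dpsi
            normal = (s * math.cos(psi), s * math.sin(psi), u)
            p = pressure(cx + R * normal[0], cy + R * normal[1], cz + R * normal[2])
            d_sigma = R * R * du * dpsi      # area element R^2 d(omega)
            for k in range(3):
                F[k] -= p * normal[k] * d_sigma
    return F


R = 0.1
F = pressure_force_on_sphere(hydrostatic_pressure, (0.0, 0.0, -1.0), R)
weight = RHO * G * (4.0 / 3.0) * math.pi * R ** 3   # weight of the displaced liquid
print(F, weight)
```

The horizontal components vanish and the vertical component agrees with the weight of the displaced liquid, while the large constant part of the pressure, p0, contributes nothing to the resultant.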

Let us return to the general case and find the resultant of forces
of pressure applied to an infinitesimal volume $(d\Omega)$. Since in the case
of a constant pressure this resultant is, as we have just demonstrated,
equal to zero, then by reasoning as we did at the end of Sec. 10.7 we
can derive that although the forces of pressure are applied to the
surface, the resultant is proportional not to the surface area but to
the volume. This is also evident from formula (52), whence it follows
that for an infinitesimal volume of integration

$$d\mathbf{F} = -(\operatorname{grad} p)\,d\Omega \qquad (53)$$

To the volume $(d\Omega)$ there can be applied certain volumetric forces
(say, forces of gravitation, centrifugal forces, and the like) distributed
with density $\mathbf{f}$. The resultant of such forces is

$$d\mathbf{F}_1 = \mathbf{f}\,d\Omega$$

Both of these forces impart to the mass $\rho\,d\Omega$ an acceleration of $\frac{d\mathbf{v}}{dt}$, or

$$\rho\,d\Omega\,\frac{d\mathbf{v}}{dt} = -(\operatorname{grad} p)\,d\Omega + \mathbf{f}\,d\Omega$$

* Note that in the liquid we obviously have z < 0, so that the pressure p in the
formula is greater than $p_0$, as it should be.
whence, cancelling $d\Omega$, we get the equation of motion:

$$\rho\,\frac{d\mathbf{v}}{dt} = -\operatorname{grad} p + \mathbf{f} \qquad (54)$$

In  the  left-hand  member  we  have  the  rate  of  change  of  the 
vector  v along  the  path  of  the  particle.  This  rate  is  written  down  in 
the  same  way  as  was  done  in  Sec.  4.1  for  the  case  of  a scalar  field 
(that  is,  it  consists  of  a local  rate  and  a convective  rate).  Its  coordinate 
representation  will  be  obtained  in  Sec.  11.8. 

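Let us consider the stationary flow of an incompressible (i.e.
for $\rho = \text{constant}$) liquid in a gravitational field. Directing the
z-axis vertically upwards, we find that $\mathbf{f}$ (i.e. the force acting on unit
volume) equals $-\rho g\,\mathbf{k}$. Forming the scalar product of both sides
of equation (54) by $d\mathbf{r} = \mathbf{v}\,dt$, we get on the left

$$\rho\,d\mathbf{v}\cdot\mathbf{v} = \frac{\rho}{2}\,d(\mathbf{v}\cdot\mathbf{v}) = \frac{\rho}{2}\,d(v^2) \quad (\text{here } v = |\mathbf{v}|)$$

On the right side, the first term after multiplying yields

$$-\operatorname{grad} p\cdot d\mathbf{r} = -\left(\frac{\partial p}{\partial x}\,dx + \frac{\partial p}{\partial y}\,dy + \frac{\partial p}{\partial z}\,dz\right) = -dp$$

and the second term, $-\rho g\,\mathbf{k}\cdot d\mathbf{r} = -\rho g\,dz$. Thus,

$$\frac{\rho}{2}\,d(v^2) = -dp - \rho g\,dz$$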

It is to be stressed that this relation holds true along flow lines; we
made use of this fact when we wrote $d\mathbf{r} = \mathbf{v}\,dt$, since this equation
determines the displacement of a moving particle along the trajectory
(see Sec. 9.4). Integrating along such a curve, we get the relation

$$\frac{\rho v^2}{2} + p + \rho g z = \text{constant} \qquad (55)$$

which should hold true in a stationary flow along every flow line.
Here the constant of integration on the right assumes distinct values
on distinct flow lines. The relation (55) is called the Bernoulli integral
and expresses the law of conservation of energy in the motion of a
particle of liquid. This is particularly clear if we multiply both sides
by $d\Omega$ and take the increment $\Delta$ (do not confuse this with the Laplacian
operator!) of both sides of the equation along a certain segment of
the flow line. We then have

$$\Delta\frac{\rho v^2}{2}\,d\Omega + \Delta p\,d\Omega + \Delta(\rho g z)\,d\Omega = 0$$

or, with $dm = \rho\,d\Omega$ denoting the mass of the particle,

$$\Delta\left[\frac{dm\,v^2}{2} + dm\,g z\right] = -\Delta p\cdot d\Omega$$
Since in the brackets we have the sum of the kinetic energy and the
potential energy of the moving particle, the left-hand side is equal
to the increment in total energy of that particle. The right side may
be represented in the form of an integral along the flow line:

$$-\int \frac{dp}{ds}\,ds\cdot d\Omega = -\int (\operatorname{grad} p)_s\,ds\cdot d\Omega = \int dF_s\,ds$$

(see formula (53)). Hence, this right-hand side is equal to the work
of the forces of pressure during the motion of the particle. Thus,
formula (55) means that the increment in the total energy of a particle
is equal to the work of the forces acting on it.

In a stationary flow, the law of conservation of energy can be
applied in yet another way. Consider a stream tube, which is a tube
through the lateral surface of which the flow lines neither enter nor
leave. Such a tube is shown schematically in Fig. 133. The walls
may be solid since no work is performed on them and the fluid does
not pass through them. We compare the power (i.e. the energy per
unit time) of a pump required to deliver fluid to the tube with the
power of a turbine at the output of the tube. These powers must
be equal if we disregard losses to friction. In unit time, $v_1 S_1$ units
of volume of fluid enter through the entry opening into the tube,
where S is the cross-sectional area of the tube and the subscript 1
indicates input. This fluid has the kinetic energy

$$\rho v_1 S_1\,\frac{v_1^2}{2}$$

Besides, the pressure $p_1$ is created in the volume $v_1 S_1$ by the power
$p_1 v_1 S_1$ of the pump, and the power $\rho v_1 S_1 g z_1$ is used to raise the liquid from
some standard altitude z = 0 to an altitude $z_1$. Setting up similar
expressions for the output cross-section and equating the sums,
we get the following notation for the law of conservation of
energy:

$$\rho v_1 S_1\frac{v_1^2}{2} + p_1 v_1 S_1 + \rho v_1 S_1 g z_1 = \rho v_2 S_2\frac{v_2^2}{2} + p_2 v_2 S_2 + \rho v_2 S_2 g z_2 \qquad (56)$$
But because of conservation of mass and by virtue of the
incompressibility of the fluid, we have $v_1 S_1 = v_2 S_2$. Cancelling this quantity out
of (56), we get

$$\frac{\rho v_1^2}{2} + p_1 + \rho g z_1 = \frac{\rho v_2^2}{2} + p_2 + \rho g z_2$$

This equation is equivalent to the Bernoulli integral (55).
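As a worked illustration of how the conservation of volume and the Bernoulli integral are used together, the following sketch computes the velocity and pressure at the outlet of a stream tube from the inlet data. The function name, the default density and gravity, and the numbers in the example are assumptions chosen for illustration (SI-style units, a water-like liquid).

```python
def downstream_state(v1, p1, z1, S1, S2, z2, rho=1000.0, g=9.81):
    """Velocity and pressure at section 2 of a stream tube, incompressible liquid."""
    # Conservation of volume: v1 * S1 = v2 * S2.
    v2 = v1 * S1 / S2
    # Bernoulli integral: rho*v^2/2 + p + rho*g*z is the same at both sections.
    p2 = p1 + 0.5 * rho * (v1 ** 2 - v2 ** 2) + rho * g * (z1 - z2)
    return v2, p2


# Hypothetical horizontal tube whose cross-section narrows by a factor of two.
v2, p2 = downstream_state(v1=2.0, p1=2.0e5, z1=0.0, S1=0.02, S2=0.01, z2=0.0)
print(v2, p2)
```

In this example the velocity doubles in the narrow section and the pressure drops accordingly, in line with the rule that a higher rate of flow goes with a lower pressure.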

From the Bernoulli integral it is quite apparent that for a constant
or slightly variable altitude, the pressure in the stream is substantially
dependent on the rate of flow: the higher the rate, the smaller the
pressure. This is quite natural: if a particle passes from a region
with low pressure into a region with higher pressure, then the
resultant of the forces of pressure acting on it is directed counter to the
velocity, which means the motion is hampered. Now if the pressure
falls with the motion, then the pressure "behind" a particle is greater
than that "in front", and the particle will be accelerated. By (55),
when a definite velocity has been attained, the pressure should
become negative; actually this does not occur: the liquid loses its
continuity and cavities appear (what is called "cavitation" sets in).

Exercise 

Using  formula  (51)  derive  an  invariant  (unrelated  to  any  choice 
of  coordinate  system)  definition  for  gradient  that  is  similar  to 
the  definition  (33)  of  divergence. 


ANSWERS  AND  SOLUTIONS 

Sec.  10.1 

It  is  easy  to  verify  that  the  relationship  between  the  old  coordi- 
nates x,  y,  z and  the  new  ones  x' , y' , z'  is  x = x' , y = y- 

z = y • Substituting  these  formulas  into  the  expression  for 

u, we get $u = x'^2 + 2y'^2$. Since z' does not appear, the field
is  plane-parallel. 


Sec.  10.2 

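1. Since $\operatorname{grad} u = y\,\mathbf{i} + x\,\mathbf{j} - 2z\,\mathbf{k}$, it follows that $(\operatorname{grad} u)_M = \mathbf{i} + 2\mathbf{j} + 6\mathbf{k}$.
The desired derivative is equal to $\frac{\mathbf{a}}{|\mathbf{a}|}\cdot\operatorname{grad} u = \frac{19}{\sqrt{10}} \approx 6.0$.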


2.  grad  u(M)  = -^- 
MyM 

Sec.  10.3 


m2m 

m2m 


$\mathbf{F} = -\operatorname{grad} U = -ay\,\mathbf{i} - ax\,\mathbf{j}$. The lines of force are
determined by the equation $\frac{dx}{y} = \frac{dy}{x}$, whence $x\,dx = y\,dy$, $x^2 = y^2 + C$.
These are hyperbolas with centre at the origin (Fig. 33).
Sec. 10.4

$Q = 2\pi|\boldsymbol\rho|\int_a^b f(|\boldsymbol\rho|, z)\,dz$, where z = a and z = b are the planes
of the bases of the cylinder. In the case of a plane-parallel field,
f does not depend on z and $Q = 2\pi|\boldsymbol\rho|\,f(|\boldsymbol\rho|)\,h$, where h is the
height of the cylinder.


Sec.  10.6 


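1. Integrating the expression for $dE_x$ indicated in the text, we get
(via the substitution $\zeta - z = \sqrt{x^2 + y^2}\,\tan\alpha$)

$$E_x = \int_{-\infty}^{\infty} \frac{\mu x\,d\zeta}{[x^2 + y^2 + (z - \zeta)^2]^{3/2}} = \int_{-\pi/2}^{\pi/2} \frac{\mu x\sqrt{x^2 + y^2}}{\cos^2\alpha}\cdot\frac{\cos^3\alpha}{(x^2 + y^2)^{3/2}}\,d\alpha$$

$$= \frac{\mu x}{x^2 + y^2}\int_{-\pi/2}^{\pi/2} \cos\alpha\,d\alpha = \frac{2\mu x}{x^2 + y^2}$$

Similarly we find $E_y = \frac{2\mu y}{x^2 + y^2}$, $E_z = 0$, whence

$$\mathbf{E} = \frac{2\mu x}{x^2 + y^2}\,\mathbf{i} + \frac{2\mu y}{x^2 + y^2}\,\mathbf{j} = \frac{2\mu\boldsymbol\rho}{|\boldsymbol\rho|^2} = \frac{2\mu}{|\boldsymbol\rho|}\,\boldsymbol\rho^0$$

The result coincides with (26).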

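2. Taking the charged plane for the yz-plane, we get $E_y = E_z = 0$
at the point (x, 0, 0) for x > 0, whereas

$$E_x = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \frac{\nu x}{(x^2 + \eta^2 + \zeta^2)^{3/2}}\,d\eta\,d\zeta$$

Passing to the polar coordinates $\rho$, $\varphi$ in the $\eta$, $\zeta$ plane, as at
the end of Sec. 4.7, we get

$$E_x = \int_0^{2\pi} d\varphi \int_0^{\infty} \frac{\nu x\,\rho\,d\rho}{(x^2 + \rho^2)^{3/2}} = 2\pi\nu x\int_0^{\infty} \frac{\rho\,d\rho}{(x^2 + \rho^2)^{3/2}} = 2\pi\nu x\left[-\frac{1}{(x^2 + \rho^2)^{1/2}}\right]_0^{\infty} = 2\pi\nu$$

From this $\mathbf{E} = 2\pi\nu\,\mathbf{i}$, which coincides with (29).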
3. Arguing as at the end of Sec. 10.6, we obtain (for the natural
meaning of the letters)


E = 2m  — ( p-|  = cos  a — 

<«UpN  il  p P 1 1 p I 

9 = 2m  Ti  (_  ln  I p I > = 2m  TT 

dl  | P | 


It  can  be  verified  that  the  traces  of  cylindrical  equipotential 
surfaces  and  also  the  electric  lines  of  force  are  located  as  in 
Fig.  130,  but  here  both  will  be  circles. 


Sec.  10.7 

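1. By formula (39), $\operatorname{div}\mathbf{A} = 2$, $\operatorname{div}\mathbf{B} = 0$. Therefore by the
Ostrogradsky formula the desired flux is equal to

$$\int_{(\Omega)} \operatorname{div}\mathbf{A}\,d\Omega = \int_{(\Omega)} 2\,d\Omega = 2\Omega = \frac{8}{3}\pi R^3, \qquad \int_{(\Omega)} \operatorname{div}\mathbf{B}\,d\Omega = 0$$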

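2. Since $\mathbf{A} = f(r)\,\mathbf{r}^0 = \frac{f(r)}{r}\,\mathbf{r}$, it follows that $A_x = \frac{f(r)}{r}\,x$ and

$$\frac{\partial A_x}{\partial x} = \frac{\left[f'(r)\frac{x}{r}\,x + f(r)\right] r - f(r)\,x\,\frac{x}{r}}{r^2} = \frac{f'(r)}{r^2}\,x^2 + f(r)\,\frac{r^2 - x^2}{r^3}$$

In similar fashion we find $\frac{\partial A_y}{\partial y}$, $\frac{\partial A_z}{\partial z}$, whence

$$\operatorname{div}\mathbf{A} = \frac{f'(r)}{r^2}\,(x^2 + y^2 + z^2) + f(r)\,\frac{3r^2 - x^2 - y^2 - z^2}{r^3} = f'(r) + \frac{2}{r}\,f(r)$$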

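3. Reasoning as in the derivation of formula (37), we obtain

$$\operatorname{div}\mathbf{A} = \frac{1}{2\pi|\boldsymbol\rho|}\,\frac{d}{d|\boldsymbol\rho|}\left[2\pi|\boldsymbol\rho|\,f(|\boldsymbol\rho|)\right] = f'(|\boldsymbol\rho|) + \frac{1}{|\boldsymbol\rho|}\,f(|\boldsymbol\rho|)$$

Here $\operatorname{div}\mathbf{A} = 0$ if $f(|\boldsymbol\rho|) = \frac{C}{|\boldsymbol\rho|}$. This is the field (26)
considered in Sec. 10.6.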


Sec.  10.8 

$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = f$, where $f = f(x, y, z, t) = \frac{dM}{d\Omega\,dt}$ and dM is the
mass of neutrons entering the flux in the volume $(d\Omega)$ during
time dt.
Sec. 10.9

1. $\varphi''(|\boldsymbol\rho|) + \frac{1}{|\boldsymbol\rho|}\,\varphi'(|\boldsymbol\rho|)$.

2. Partition the sphere $(\Omega)_r$ into infinitely thin "bubbles" of radius
s and thickness ds ($0 \le s \le r$). Then by formula (46)

$$\int_{(\Omega)_r} \varphi\,d\Omega = \int_0^r ds \oint_{(\sigma)_s} \varphi\,d\sigma = \int_0^r 4\pi s^2\,\varphi(O)\,ds = \frac{4}{3}\pi r^3\,\varphi(O)$$

whence follows the required assertion.

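3. Let O be the origin. For reasons of symmetry, we have

$$\oint x\,d\sigma = \oint y\,d\sigma = \oint z\,d\sigma = \oint xy\,d\sigma = \oint xz\,d\sigma = \oint yz\,d\sigma = 0$$

(all integrals are taken over $(\sigma)_r$); besides, $\oint x^2\,d\sigma = \oint y^2\,d\sigma = \oint z^2\,d\sigma$, whence

$$\oint x^2\,d\sigma = \frac{1}{3}\oint (x^2 + y^2 + z^2)\,d\sigma = \frac{1}{3}\,r^2\cdot 4\pi r^2 = \frac{4}{3}\pi r^4$$

Applying an expansion of type (27) of Ch. 4 with a = b = 0, we get,
to within fifth-order infinitesimals,

$$\oint_{(\sigma)_r} \varphi\,d\sigma = \varphi(O)\cdot 4\pi r^2 + \frac{4}{3}\pi r^4\cdot\frac{1}{2}\left(\frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2}\right)_O$$

whence formula (47) readily follows.

Sec. 10.10

$$\operatorname{grad} p(M) = \lim_{(\Delta\Omega)\to M} \frac{\oint_{(\Delta\sigma)} p\,d\boldsymbol\sigma}{\Delta\Omega} \qquad (57)$$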


Note  that  this  formula  holds  true  for  any  scalar  field  and  not 
only  for  the  pressure  field. 


Chapter  11 

VECTOR  PRODUCT  AND  ROTATION 


11.1  The  vector  product  of  two  vectors 

In vector algebra, besides the multiplication of a vector by a scalar
(Sec. 9.1) and the scalar product of two vectors (Sec. 9.2) there is also
defined a vector product of two vectors, which we now discuss.

Recall (Sec. 10.4) that a surface in space is said to be oriented if we
indicate which side is the outside and which the inside. It is customary to
assume that if this surface is open (has a boundary), then the orientation
of the surface also generates an orientation of its contour, that
is to say, a sense of traversal of the contour (boundary). Conversely,
if the sense of traversal of the contour is indicated, then this leads
to an orientation of the surface. The connection between the
orientation of a surface and the orientation of its contour is indicated in
Fig. 134. If for the basis of a system of coordinates we take the right
triad of vectors i, j, k (such that when looking from the end of the
third vector we see the shortest rotation from the first to the second
counterclockwise), then we use the rule of the right-handed screw,
otherwise we have the rule of the left-handed screw. For example, the
rule of the right-handed screw can be stated thus: if a right-handed
screw (which is the one usually used in engineering and ordinary life)
is rotated in the sense of traversal of the contour, then the screw will
move from the inside of the surface to the outside. This can be put
differently: if a tiny man walks on the outside of the surface along
the contour (boundary) in the indicated direction of traversal (see
Fig. 134), then the surface itself must be always on the left.**

* This chapter is a direct continuation of the two preceding chapters and relies
heavily  on  the  material  of  Secs.  9.1  and  9.2  and  Secs.  10.1-10.4  and  10.7. 
Besides,  we  make  use  of  the  multiple  integral  (Sec.  4.7),  determinants  (Sec.  8.3) 
and  the  delta  function  (Sec.  6.1).  In  Sec.  11.12,  use  is  made  of  the  material 
of  Secs.  10.5  to  10.9. 

** The first three items of Fig. 134 refer to the right-handed coordinate system.
From the drawing it is clear that if we turn the right-handed screw from i to j,
screwing it out of the plate, the screw itself will move in the direction of k.
The traversal of the contour in this sense corresponds to the underside of the
surface (σ) being the inside and the upper side the outside; the vector σ is directed
along k.
[Fig. 134. Orientation of a surface and of its contour (outside, sense of traversal) for the right-handed and the left-handed rule]

[Fig. 135]


Let  us  now  take  up  the  notion  of  a vector  product.  The  vector 
product  of  two  vectors  a and  b is,  by  definition,  the  vector  S of  an 
area  (S)  (see  Sec.  10.4)  that  is  obtained  if  a and  b are  referred  to  a 
single  origin,  a parallelogram  is  then  constructed  on  these  vectors, 
and  the  contour  is  traversed  beginning  with  the  first  vector  (i.e.  a; 
see  Fig.  135  where  the  rule  of  the  right-handed  screw  is  used;  this 
rule  will  always  be  employed  unless  otherwise  stated). 

To summarize, the vector product of two vectors a and b is a vector
directed perpendicularly to the two vectors, with modulus
equal to the area of a parallelogram constructed on a and b, and
forming with these vectors a triad of the same sense (i.e. right-handed
or left-handed) as the vectors i, j, k. A vector product (also sometimes
called cross product) is denoted by a × b or [ab].

A few of the most important properties of a vector product are:
The vector product of two nonzero vectors is equal to the zero vector
(null vector) if and only if the vectors are parallel:

a × b = 0 is equivalent to a ∥ b

since parallelism of the vectors amounts to the parallelogram degenerating
into a line segment with area equal to zero. In particular,
it is always true that a × a = 0.
A vector product is anticommutative:*

$$\mathbf{b} \times \mathbf{a} = -(\mathbf{a} \times \mathbf{b})$$

Indeed, if the order of the factors is reversed, the parallelogram remains
unchanged, but the contour is traversed in the opposite sense
and  therefore  the  vector  of  the  area  is  reversed. 

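It can be verified that a scalar factor can be taken outside the
sign of a vector product:

$$(\lambda\mathbf{a}) \times \mathbf{b} = \mathbf{a} \times (\lambda\mathbf{b}) = \lambda(\mathbf{a} \times \mathbf{b})$$

and that the distributive law holds:

$$(\mathbf{a} + \mathbf{b}) \times \mathbf{c} = \mathbf{a} \times \mathbf{c} + \mathbf{b} \times \mathbf{c}, \qquad \mathbf{c} \times (\mathbf{a} + \mathbf{b}) = \mathbf{c} \times \mathbf{a} + \mathbf{c} \times \mathbf{b}$$

When multiplying out expressions containing a vector product,
watch the order of the factors carefully. For example,

$$(\mathbf{a} + 2\mathbf{b}) \times (2\mathbf{a} - 3\mathbf{b}) = 2\mathbf{a} \times \mathbf{a} - 3\mathbf{a} \times \mathbf{b} + 4\mathbf{b} \times \mathbf{a} - 6\mathbf{b} \times \mathbf{b} = -7\,\mathbf{a} \times \mathbf{b}$$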

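Suppose the vectors a and b are given in expansions in terms of
their Cartesian projections:

$$\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k}, \qquad \mathbf{b} = b_x\mathbf{i} + b_y\mathbf{j} + b_z\mathbf{k}$$

Then, taking advantage of the following equations (verify them!),

$$\mathbf{i} \times \mathbf{j} = \mathbf{k},\quad \mathbf{j} \times \mathbf{i} = -\mathbf{k},\quad \mathbf{j} \times \mathbf{k} = \mathbf{i},\quad \mathbf{k} \times \mathbf{j} = -\mathbf{i},\quad \mathbf{k} \times \mathbf{i} = \mathbf{j},\quad \mathbf{i} \times \mathbf{k} = -\mathbf{j} \tag{1}$$

we get

$$\mathbf{a} \times \mathbf{b} = (a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k}) \times (b_x\mathbf{i} + b_y\mathbf{j} + b_z\mathbf{k})$$
$$= a_xb_y\mathbf{k} - a_xb_z\mathbf{j} - a_yb_x\mathbf{k} + a_yb_z\mathbf{i} + a_zb_x\mathbf{j} - a_zb_y\mathbf{i}$$
$$= \mathbf{i}(a_yb_z - a_zb_y) + \mathbf{j}(a_zb_x - a_xb_z) + \mathbf{k}(a_xb_y - a_yb_x) \tag{2}$$

(think through the structure of the last expression).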

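This result is easy to remember if one writes it in the form of
a determinant (see formula (13) of Ch. 8):

$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix} \tag{3}$$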


* Generally, an operation on any entities whatsoever is said to be commutative
if it is independent of the order in which these entities are taken. The
multiplication of ordinary numbers and the scalar multiplication of vectors are
commutative.
Suppose we have to compute the area S of a parallelogram constructed
on the vectors a = 3i − 2j + k and b = −2i + j + 4k.
Since S = |a × b|, we calculate as follows:

$$\mathbf{S} = \mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 3 & -2 & 1 \\ -2 & 1 & 4 \end{vmatrix} = \mathbf{i}(-8 - 1) + \mathbf{j}(-2 - 12) + \mathbf{k}(3 - 4) = -9\mathbf{i} - 14\mathbf{j} - \mathbf{k},$$

$$S = |\mathbf{a} \times \mathbf{b}| = \sqrt{9^2 + 14^2 + 1^2} \approx 16.7$$

Since  in  this  example  the  vectors  a,  b are  nondimensional,  the  area  S 
is  nondimensional  as  well. 
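The worked example above can be checked numerically. The following is a minimal sketch in plain Python, assuming a right-handed Cartesian basis; the helper names `cross` and `norm` are illustrative and not part of the text.

```python
import math

def cross(a, b):
    """Vector product by formula (2), for components in a right-handed basis."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def norm(v):
    """Modulus of a vector given by its Cartesian projections."""
    return math.sqrt(sum(c * c for c in v))

a = (3, -2, 1)
b = (-2, 1, 4)
S_vec = cross(a, b)   # vector of the area of the parallelogram
area = norm(S_vec)    # its modulus is the area itself
print(S_vec, round(area, 1))
```

Swapping the arguments, `cross(b, a)`, reverses every component, which is the anticommutative property stated earlier.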

Also sometimes used is the scalar triple product of three
vectors a, b, c, which, by definition, is equal to the scalar quantity
(a × b) · c. The geometric meaning of this product is evident from
Fig. 136:

$$(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{c} = \mathbf{d} \cdot \mathbf{c} = |\mathbf{d}|\,c_d = |\mathbf{a} \times \mathbf{b}|\,c_d = Sh = V$$

where d = a × b and c_d is the projection of c on d. Thus, the scalar triple
product of three vectors is equal to the volume
of a parallelepiped constructed on these vectors. In Fig. 136 the vectors
a, b, c form a right-handed triad and we have a volume with the
plus sign. For a left-handed triad the angle between c and d would
be obtuse; in this case (a × b) · c = −V. (We assume here the rule
of the right-handed screw.) A scalar triple product is equal to zero
if and only if all three vectors are parallel to a single plane, because
such parallelism means that the parallelepiped has degenerated into
a part of the plane and thus has zero volume.

It is easy to obtain an expression for a scalar triple product if
the expansions of the factors are given in a Cartesian system of
coordinates. For this purpose we have to multiply the right side of (2)
by the vector c = c_x i + c_y j + c_z k in the customary way to get a scalar
product (via formula (5) of Ch. 9). After regrouping, this yields

$$(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{c} = (a_yb_z - a_zb_y)c_x + (a_zb_x - a_xb_z)c_y + (a_xb_y - a_yb_x)c_z$$
$$= a_x(b_yc_z - b_zc_y) - a_y(b_xc_z - b_zc_x) + a_z(b_xc_y - b_yc_x) = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix}$$

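We will also need a formula for the vector triple product a × (b × c)
of three vectors. To derive it, suppose that we have chosen the
coordinate axes so that the x-axis is along the vector b and the y-axis
lies in the plane of the vectors b and c. Then the vector b will have a
projection on the x-axis alone, or b = b_x i; similarly, c = c_x i + c_y j,
a = a_x i + a_y j + a_z k. Using formula (3), we get

$$\mathbf{b} \times \mathbf{c} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ b_x & 0 & 0 \\ c_x & c_y & 0 \end{vmatrix} = b_xc_y\mathbf{k}, \qquad \mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & a_y & a_z \\ 0 & 0 & b_xc_y \end{vmatrix} = a_yb_xc_y\mathbf{i} - a_xb_xc_y\mathbf{j}$$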

This result is inconvenient in that it is “attached” to a special
choice of coordinate axes. We therefore transform it (verify this):

$$\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = (a_xc_x + a_yc_y)\,b_x\mathbf{i} - a_xb_x(c_x\mathbf{i} + c_y\mathbf{j}) = \mathbf{b}(\mathbf{a} \cdot \mathbf{c}) - \mathbf{c}(\mathbf{a} \cdot \mathbf{b}) \tag{4}$$

This formula no longer contains the coordinate projections and for
this reason is independent of any choice of coordinate system. To
remember this formula, use the mnemonic “abc equals bac minus
cab”.
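Formula (4) can also be spot-checked numerically for vectors in general position, not only in the special axes used in the derivation. The sketch below is an illustration only: the helper names and the random test vectors are assumptions, not part of the text.

```python
import random

def cross(a, b):
    """Vector product by formula (2)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two vectors."""
    return sum(p * q for p, q in zip(a, b))

def bac_minus_cab(a, b, c):
    """Right-hand side of formula (4): b(a.c) - c(a.b)."""
    ac, ab = dot(a, c), dot(a, b)
    return tuple(bi * ac - ci * ab for bi, ci in zip(b, c))

random.seed(0)
max_err = 0.0
for _ in range(100):
    a, b, c = (tuple(random.uniform(-1, 1) for _ in range(3)) for _ in range(3))
    lhs = cross(a, cross(b, c))
    rhs = bac_minus_cab(a, b, c)
    max_err = max(max_err, max(abs(l - r) for l, r in zip(lhs, rhs)))
print(max_err)  # only rounding error is expected

# The vector product is not associative: compare (i x i) x j with i x (i x j).
i, j = (1, 0, 0), (0, 1, 0)
print(cross(cross(i, i), j), cross(i, cross(i, j)))
```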

Exercises 

1. Find the area of a triangle with vertices
A(1, 0, −2), B(−1, 1, 2), C(1, 3, 3).

2. What kind of triad is formed by the vectors
a = 2i − j, b = 3i + 2k, c = −i − j + k?

3. Compute (i × i) × j and i × (i × j) and show that the vector product
does not possess the associative property.

11.2  Some  applications  to  mechanics 

A vector product is particularly convenient for describing rotational
motion and its associated notions. Let us consider the rotation
of a rigid body about a certain axis (Fig. 137) with angular velocity ω.
Such rotation is customarily represented by the angular velocity
vector ω located on the axis of rotation and directed in accordance
with the sense of rotation via the chosen rule of the screw;
in Fig. 137 the sense of ω is chosen by the rule of the right-handed
screw, as we will do in all cases. It is immaterial where precisely
the vector ω is taken, for it is a sliding vector, that is, one that can
be arbitrarily moved along an axis but not away from the axis.*

We assume that the origin O is chosen at any point on the axis
of rotation and we seek the linear velocity v of an arbitrary point
M with radius vector r (Fig. 137). It is obvious that v is perpendicular
to both vectors r and ω and is therefore perpendicular to the entire
parallelogram (S) constructed on these vectors. The numerical value
of v is equal to the product of ω by the shortest distance between M
and the axis of rotation, which is precisely the area of the indicated
parallelogram. But these conditions, which are formulated for the
vector v, are satisfied by the vector product ω × r. Thus

$$\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r} \tag{5}$$

(check to see that the vector product must be taken in the order written
and that the right-hand side of (5) is independent of any specific
choice of point O on the axis of rotation).
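The two checks suggested in the parenthesis can be tried on numbers. This is a small sketch under assumed data: the axis of rotation is taken to be the z-axis, and the angular velocity, the point M and the two origins are arbitrary illustrative values.

```python
import math

def cross(a, b):
    """Vector product by formula (2)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def velocity(omega, point, origin):
    """Linear velocity v = omega x r of a point, with r drawn from the origin O."""
    r = tuple(p - o for p, o in zip(point, origin))
    return cross(omega, r)

omega = (0.0, 0.0, 2.0)   # 2 rad/s about the z-axis (assumed)
M = (3.0, 4.0, 5.0)       # point of the body (assumed)

v1 = velocity(omega, M, (0.0, 0.0, 0.0))    # origin O on the axis
v2 = velocity(omega, M, (0.0, 0.0, -7.0))   # another origin on the same axis
speed = math.sqrt(sum(c * c for c in v1))
distance_to_axis = math.hypot(M[0], M[1])
print(v1, v2, speed, 2.0 * distance_to_axis)
```

Both origins give the same velocity, and its modulus equals the angular velocity times the distance from M to the axis.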


* Recall that, earlier, the point of application of a vector was not fixed and it
was assumed possible to have a parallel displacement of a vector to any point.

The convenience of the vector of angular velocity is clearly seen
in the following. Suppose that a body experiences two rotations at
the same time (generally, nonparallel) with angular velocity vectors
ω₁ and ω₂, the axes of rotation intersecting at the point O. Then by
virtue of (5) the linear velocity of any point M is

$$\mathbf{v} = \mathbf{v}_1 + \mathbf{v}_2 = \boldsymbol{\omega}_1 \times \mathbf{r} + \boldsymbol{\omega}_2 \times \mathbf{r} = (\boldsymbol{\omega}_1 + \boldsymbol{\omega}_2) \times \mathbf{r} = \boldsymbol{\omega} \times \mathbf{r}$$

where ω = ω₁ + ω₂. Hence the body rotates with angular velocity
ω and we come to the conclusion that in a composition of rotations
the vectors of angular velocity combine via the parallelogram law.
It is precisely for this reason that we can call the angular velocity
a vector.

Using the vector product, we can introduce such an important
notion as the moment of an arbitrary vector b with origin at the point
M relative to any point O: by definition, this moment is equal to
r × b, where r is the vector OM. In mechanics one mostly has to do with the
moment of a force F, that is, the quantity r × F, and the moment
of the momentum mv, that is, the quantity r × mv.

When computing the moment of a vector, the vector may be
regarded as sliding. If the vector b slides along itself, then this means
that to r is added λb, where λ is a scalar. However,

$$(\mathbf{r} + \lambda\mathbf{b}) \times \mathbf{b} = \mathbf{r} \times \mathbf{b} + (\lambda\mathbf{b}) \times \mathbf{b} = \mathbf{r} \times \mathbf{b} + \mathbf{0} = \mathbf{r} \times \mathbf{b}$$

which means that the moment of the vector remains unchanged
under such a sliding (Fig. 138). But if the vector is carried away from
its line of action, the moment changes.
Let us consider a system of particles connected in some manner,
each of which has a constant mass m_i and a (generally variable)
radius vector r_i = r_i(t). Suppose each of these points is acted upon
by distinct forces; we denote the resultant of all “internal” forces
(the forces of interaction between the points of the system) applied
to the ith particle by F_i^in and the resultant of all “external” forces
by F_i^ex. The peculiarity of the internal forces is that, on the basis
of Newton's third law (“to every action there is an equal and opposite
reaction”), for every internal force there is an equal and opposite
internal force which, and this is very important, is located
on the extension of the first force. For this reason the sum of all
internal forces and also the sum of their moments about any point
are equal to zero.

The basic equations of motion of a system of particles are obtained
if we write Newton's second law (“force equals mass times
acceleration”):

$$m_i \frac{d^2\mathbf{r}_i}{dt^2} = \mathbf{F}_i^{\text{ex}} + \mathbf{F}_i^{\text{in}} \tag{6}$$

Summing these equations over all particles, we get

$$\sum_i m_i \frac{d^2\mathbf{r}_i}{dt^2} = \sum_i \mathbf{F}_i^{\text{ex}} + \sum_i \mathbf{F}_i^{\text{in}} = \sum_i \mathbf{F}_i^{\text{ex}} \tag{7}$$

since the sum of all internal forces, as has already been mentioned,
is equal to zero. It is convenient to introduce a point C with radius
vector

$$\mathbf{r}_C = \frac{1}{M}\sum_i m_i\mathbf{r}_i, \quad\text{where}\quad M = \sum_i m_i$$

is the total mass of the system. This point is called the centre of mass
of the system at hand. In this notation, equation (7) can be rewritten as

$$\frac{d^2}{dt^2}\sum_i m_i\mathbf{r}_i = \sum_i \mathbf{F}_i^{\text{ex}}, \quad\text{or}\quad M\frac{d^2\mathbf{r}_C}{dt^2} = \sum_i \mathbf{F}_i^{\text{ex}}$$

Thus, the centre of mass moves as if it possessed the total mass of
the system and were acted upon by a force equal to the sum of
all external forces. In particular, if external forces are absent, then
the centre of mass of a system is in rectilinear and uniform motion,
r_C = at + b.

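Now let us take up moments. If both sides of (6) are multiplied
vectorially on the left by r_i, we get

$$\mathbf{r}_i \times m_i\frac{d^2\mathbf{r}_i}{dt^2} = \mathbf{r}_i \times \mathbf{F}_i^{\text{ex}} + \mathbf{r}_i \times \mathbf{F}_i^{\text{in}} \tag{8}$$

Take advantage of the equation

$$\frac{d}{dt}\left(\mathbf{r}_i \times m_i\frac{d\mathbf{r}_i}{dt}\right) = \frac{d\mathbf{r}_i}{dt} \times m_i\frac{d\mathbf{r}_i}{dt} + \mathbf{r}_i \times m_i\frac{d^2\mathbf{r}_i}{dt^2} = \mathbf{r}_i \times m_i\frac{d^2\mathbf{r}_i}{dt^2}$$

(the first term vanishes as the vector product of parallel vectors),
which follows from the general formula of the derivative of a vector
product. This formula,

$$\frac{d}{dt}(\mathbf{A} \times \mathbf{B}) = \frac{d\mathbf{A}}{dt} \times \mathbf{B} + \mathbf{A} \times \frac{d\mathbf{B}}{dt},$$

is derived in exactly the same way as the similar formula (8) of Ch. 9 for a
scalar product. From this we can rewrite (8) thus

$$\frac{d}{dt}(\mathbf{r}_i \times m_i\mathbf{v}_i) = \mathbf{r}_i \times \mathbf{F}_i^{\text{ex}} + \mathbf{r}_i \times \mathbf{F}_i^{\text{in}}$$

Summing these equations with respect to all i, we get

$$\frac{d}{dt}\sum_i (\mathbf{r}_i \times m_i\mathbf{v}_i) = \sum_i \mathbf{r}_i \times \mathbf{F}_i^{\text{ex}} + \sum_i \mathbf{r}_i \times \mathbf{F}_i^{\text{in}} = \sum_i \mathbf{r}_i \times \mathbf{F}_i^{\text{ex}} \tag{9}$$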

since the sum of the moments of all internal forces is zero. The sum

$$\mathbf{G} = \sum_i \mathbf{r}_i \times m_i\mathbf{v}_i \tag{10}$$

of the angular momenta of all particles comprising a system is termed
the angular momentum of this system with respect to the same point
O, relative to which all the angular momenta are taken. The sum

$$\mathbf{L} = \sum_i \mathbf{r}_i \times \mathbf{F}_i^{\text{ex}}$$

of the moments of all external forces acting on the system is called
the total moment of the external forces. Thus, formula (9) can be
rewritten as

$$\frac{d\mathbf{G}}{dt} = \mathbf{L} \tag{11}$$

which is to say that the rate of change of the angular momentum of
a system is equal to the total moment of the external forces acting on
the system. In the special case of the absence of external forces or if
their total moment is equal to zero, we find that the angular momentum
of the system remains constant.
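The conservation statement can be watched in a small numerical experiment. The sketch below is only an illustration under stated assumptions: two particles joined by a spring (an internal force acting along the line joining them), no external forces, and a simple kick-then-drift time stepping; the masses, spring constants and initial data are made up for the example.

```python
import math

def cross(a, b):
    """Vector product by formula (2)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angular_momentum(masses, pos, vel):
    """G = sum of r_i x m_i v_i about the origin, formula (10)."""
    G = [0.0, 0.0, 0.0]
    for m, r, v in zip(masses, pos, vel):
        g = cross(r, tuple(m * c for c in v))
        G = [Gc + gc for Gc, gc in zip(G, g)]
    return tuple(G)

def internal_forces(pos, k=3.0, rest=1.0):
    """Spring between particles 0 and 1: equal, opposite, along the joining line."""
    d = [q - p for p, q in zip(pos[0], pos[1])]
    length = math.sqrt(sum(c * c for c in d))
    f = [k * (length - rest) * c / length for c in d]   # force on particle 0
    return [tuple(f), tuple(-c for c in f)]

def simulate(masses, pos, vel, dt=1e-3, steps=5000):
    """Advance the system in time; returns the final positions and velocities."""
    pos = [list(p) for p in pos]
    vel = [list(v) for v in vel]
    for _ in range(steps):
        forces = internal_forces(pos)
        for i, m in enumerate(masses):
            for c in range(3):
                vel[i][c] += forces[i][c] / m * dt   # kick
                pos[i][c] += vel[i][c] * dt          # drift
    return pos, vel

masses = (1.0, 2.0)
pos0 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
vel0 = [(0.0, 0.3, 0.1), (0.0, -0.2, 0.4)]
G_start = angular_momentum(masses, pos0, vel0)
pos1, vel1 = simulate(masses, pos0, vel0)
G_end = angular_momentum(masses, pos1, vel1)
print(G_start, G_end)
```

With this stepping scheme each kick adds moments that cancel in pairs and each drift leaves r × mv unchanged, so the two printed vectors should agree up to rounding error.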

Exercise 

When is the moment of a vector b about a point O equal to zero?

11.3  Motion  in  a central-force  field 

Suppose a particle is in motion acted upon by a force varying
in arbitrary fashion but all the time directed towards (or away
from) the coordinate origin O. This is called a central-force field. If
at the initial time we pass a plane (P) through the origin and through
the velocity vector, the point in its motion will not leave the plane
since there are no forces capable of carrying the point out of (P).
(The plane (P) coincides with the plane of Fig. 139.) Since the
external force is directed along a radius vector, its moment is equal
to zero and for this reason the angular momentum of the point,
r × m dr/dt, remains constant all the time. However, the modulus of
the vector r × dr is equal to double the area of the triangle (dS)
shown in Fig. 139 (why is this?). For this reason, as the point
moves, the derivative dS/dt remains constant, which means that S
varies linearly with respect to time. We thus arrive at Kepler's
second law: in the case of a central force, the areas described by
the radius vector of a planet in equal times are equal. The other
laws of Kepler (to be discussed later on) make essential use of the
specific form of dependence of force on the length of the radius vector.*

[Fig. 139]

Let F = −F(r) r⁰. By virtue of Sec. 10.2, such a field has a
potential U(r), where the function U(r) is the primitive of F(r), i.e.
dU/dr = F(r). Let us introduce the polar coordinates r, φ in the (P)
plane. Then the law of motion of a particle will be determined by the
relations r = r(t), φ = φ(t).

To find these relations, we take advantage of two conservation
laws: that of angular momentum (Sec. 11.2) and of total energy
(Sec. 10.3). For this purpose, denote by s⁰ the vector obtained from
r⁰ by a rotation through 90° in the positive sense. Then

$$\frac{d\mathbf{r}^0}{dt} = \frac{d\varphi}{dt}\,\mathbf{s}^0$$


* Kepler discovered his laws empirically in studying the motions of the planets of
the solar system. It was Newton who demonstrated that these laws are
consequences of a definite type of gravitational force with which a central body, the
sun, acts on the planets.
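(This rather obvious formula is a special case of formula (14) of
Ch. 9 and actually coincides with formula (10) of Ch. 5.) From this
we have

$$\frac{d\mathbf{r}}{dt} = \frac{d}{dt}(r\mathbf{r}^0) = \frac{dr}{dt}\,\mathbf{r}^0 + r\frac{d\varphi}{dt}\,\mathbf{s}^0, \qquad \mathbf{r} \times m\frac{d\mathbf{r}}{dt} = mr^2\frac{d\varphi}{dt}\,\mathbf{r}^0 \times \mathbf{s}^0 \tag{12}$$

Since the vector r⁰ × s⁰ = P⁰ is perpendicular to the (P) plane
and has a constant modulus equal to 1, the law of conservation of
the angular momentum can be written as

$$mr^2\frac{d\varphi}{dt} = G \quad (= \text{constant}) \tag{13}$$

The law of conservation of total energy becomes, by virtue of (12),

$$\frac{m}{2}\left[\left(\frac{dr}{dt}\right)^2 + \left(r\frac{d\varphi}{dt}\right)^2\right] + U(r) = E \quad (= \text{constant}) \tag{14}$$

Here the constants G and E are determined by the initial conditions.
Express dφ/dt from (13) and substitute it into (14) to get the
first-order differential equation

$$\frac{m}{2}\left(\frac{dr}{dt}\right)^2 + \left[U(r) + \frac{G^2}{2mr^2}\right] = E \tag{15}$$

for r(t): having found r(t), we can obtain φ(t) from equation (13).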
Equation (15) is quite analogous to the equation

$$\frac{m}{2}\left(\frac{dx}{dt}\right)^2 + u(x) = E \tag{16}$$

of the motion of a particle m along the x-axis under the action of a
force with potential u(x). Let us recall the simple properties of the
solutions of (16) which are considered, for example, in HM, Sec. 6.8.
Suppose that the graph of the potential u(x) is of the form given in
Fig. 140 so that u(∞) = u_∞. Then for u_min < E < u_∞, say for
E = E₁ in Fig. 140, the particle will oscillate periodically having a
and b as turning points; we say that the motion is finite. The period of
oscillations, as can readily be derived from (16), is

$$T = \sqrt{2m}\int_a^b \frac{dx}{\sqrt{E - u(x)}} \tag{17}$$

But if E ≥ u_∞, say for E = E₂ in Fig. 140, the particle moving
leftwards will reach point c and turn back, but in its rightward
motion it will go off to infinity, and the motion becomes infinite.
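Formula (17) lends itself to direct numerical evaluation. The sketch below is one possible way of doing it, not the book's method: the integrand is infinite at the turning points a and b, so the (assumed) substitution x = c + h sin θ is used together with the midpoint rule. The caller is expected to supply a and b with u(a) = u(b) = E; the quadratic test potential is an illustrative choice.

```python
import math

def period(u, E, a, b, m=1.0, n=2000):
    """T = sqrt(2m) * integral from a to b of dx / sqrt(E - u(x)), formula (17).

    a and b must be the turning points, u(a) = u(b) = E.
    """
    c, h = 0.5 * (a + b), 0.5 * (b - a)
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -0.5 * math.pi + (i + 0.5) * dtheta   # midpoints avoid the end points
        x = c + h * math.sin(theta)
        total += h * math.cos(theta) / math.sqrt(E - u(x)) * dtheta
    return math.sqrt(2.0 * m) * total

# Test case: u(x) = k x^2 / 2, for which the period is known to be 2*pi*sqrt(m/k).
k, m, E = 4.0, 1.0, 2.0
A = math.sqrt(2.0 * E / k)                     # amplitude: u(A) = E
T = period(lambda x: 0.5 * k * x * x, E, -A, A, m)
print(T, 2.0 * math.pi * math.sqrt(m / k))
```

For the quadratic potential the result does not depend on E, which can be checked by changing `E` in the example.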
Equation (15) takes the form (16) if we write x instead of r
(which is inessential) and set

$$u(r) = U(r) + \frac{G^2}{2mr^2} \quad (0 < r < \infty) \tag{18}$$

Thus, besides the initial potential U(r) there is added a centrifugal
potential G²/(2mr²), which depends on the initial conditions, more exactly,
on the angular momentum G of the system at hand.

And so if u_min < E < u_∞ (= U_∞), the relation r(t) will
be periodic. During each period of variation of r the polar angle
receives one and the same increment Δφ. Since Δφ is, generally,
incommensurable with 2π, the trajectory will, as a rule, have the
form of a rosette (Fig. 141) and in subsequent motion will everywhere
densely fill the annulus between the circles r = r_min and r = r_max.

However, for the two most interesting types of central forces
that we now proceed to discuss, the trajectories turn out to be closed
paths without self-intersections; what is more, they are simply
ellipses! In the first example we assume that the force is inversely
proportional to the square of the distance of the particle from the
origin. Such is the law of motion of celestial bodies (Newton's law)
or of electrically charged particles (Coulomb's law) if the central
body that gives rise to the force of attraction can be regarded as
at rest (we will go into this condition later on).

In the example at hand, F(r) = k/r², U(r) = −k/r, and so
by (18)

$$u(r) = -\frac{k}{r} + \frac{G^2}{2mr^2}$$

Fig. 142 shows the graph of this effective potential for k > 0 (as we shall
assume it to be from now on) and for distinct values of G. It is clear
that if G ≠ 0, i.e. if the motion does not degenerate into rectilinear
motion, then the centrifugal potential builds up faster than
the attractive potential as r → +0, and so u(+0) = ∞. This means that a
moving particle cannot fall onto the central body, at least if we
assume it to be sufficiently small and thus if we disregard the
possibility that the particle will “fly into” it by passing too close to
the origin.
Since du/dr = k/r² − G²/(mr³), the minimum of the effective potential
is attained when

$$\frac{k}{r^2} - \frac{G^2}{mr^3} = 0, \quad\text{whence}\quad r = \bar{r} = \frac{G^2}{mk} \quad\text{and}\quad u_{\min} = -\frac{mk^2}{2G^2}$$

Hence the solution r = G²/(mk) is possible; corresponding to this
solution is particle motion in a circle with centre at the origin. The
corresponding angular velocity is determined from equation (13):

$$\frac{d\varphi}{dt} = \frac{G}{m\bar{r}^2} = \sqrt{\frac{k}{m\bar{r}^3}} \tag{19}$$

It is constant, i.e. we obtain a uniform revolution of the particle
about the central body. If a gravitational field is considered, then
k = κmM, where κ is the gravitational constant and M is the mass
of the central body. From this, by (19), we get the period of
revolution:

$$T = \frac{2\pi}{d\varphi/dt} = 2\pi\sqrt{\frac{m\bar{r}^3}{\kappa mM}} = \frac{2\pi}{\sqrt{\kappa M}}\,\bar{r}^{3/2}$$

This is Kepler's third law: the squares of the periods of revolution
of the planets about the sun are proportional to the cubes of their
distances from the sun. Here the proof is given only for circular
orbits, but with a proper refinement of the statement it proves to be
valid for all orbits (see Exercise 1).
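As a quick plausibility check of the period formula for circular orbits, one can insert rounded data for the earth and the sun. The numerical values below (gravitational constant, solar mass, mean earth-sun distance) are assumed reference figures, not taken from the text, and the function name is illustrative.

```python
import math

def circular_period(r, kappa, M):
    """Period of revolution on a circular orbit: T = 2*pi*r**1.5 / sqrt(kappa*M)."""
    return 2.0 * math.pi * r ** 1.5 / math.sqrt(kappa * M)

KAPPA = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
M_SUN = 1.989e30     # mass of the sun, kg (assumed value)
R_EARTH_ORBIT = 1.496e11   # mean earth-sun distance, m (assumed value)

T_days = circular_period(R_EARTH_ORBIT, KAPPA, M_SUN) / 86400.0
print(T_days)   # should come out close to one year

# Kepler's third law: quadrupling the radius multiplies the period by 4**1.5 = 8.
ratio = circular_period(4.0 * R_EARTH_ORBIT, KAPPA, M_SUN) / circular_period(R_EARTH_ORBIT, KAPPA, M_SUN)
print(ratio)
```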

To determine the form of noncircular paths for G ≠ 0, substitute
into (15) U(r) = −k/r and complete the square to get (verify
this!)

$$\left(\frac{dr}{dt}\right)^2 + \left(\frac{G}{mr} - \frac{k}{G}\right)^2 = q^2 \tag{20}$$

where q = [(2/m)(E + mk²/(2G²))]^{1/2}. Here we
take into account the inequality E > −mk²/(2G²) (what does it follow
from?). By (20) we can set

$$\frac{k}{G} - \frac{G}{mr} = q\cos\psi, \qquad \frac{dr}{dt} = q\sin\psi, \quad\text{where}\quad \psi = \psi(t) \tag{21}$$

Differentiating the first equation of (21) with respect to t, we get
(G/mr²) dr/dt = −q sin ψ dψ/dt, whence, taking into account the second
equation of (21) and also (13), we find that dψ/dt = −dφ/dt, or ψ = α − φ,
where α = constant is determined by the initial conditions. From
the first equation of (21) we get

$$r = \frac{G^2}{km}\left[1 - \frac{Gq}{k}\cos(\alpha - \varphi)\right]^{-1} = p\,[1 - \varepsilon\cos(\varphi - \alpha)]^{-1} \tag{22}$$

where

$$p = \frac{G^2}{km}, \qquad \varepsilon = \frac{Gq}{k}$$

[Fig. 143]

From (22) it is apparent at once that for ε < 1 the trajectory is
a closed path (Fig. 143), whereas for ε ≥ 1 it goes off to
infinity. It is easy to verify that the condition ε < 1 is equivalent
to the inequality E < 0; thus (see Fig. 142) the closed lines serve as
paths of bounded motions. To be more exact, the equation (22) for
ε < 1 defines an ellipse with focus at the origin and major axis
inclined at an angle α to the polar axis. (We leave it to the reader to
prove this assertion by going over to Cartesian coordinates via the
formulas x' = r cos(φ − α), y' = r sin(φ − α).) Thus, the trajectories
of bounded motions are ellipses with one of the foci in the central
body; therein lies Kepler's first law. In similar fashion it can be
verified that for ε > 1 the motion occurs along a hyperbola, and in the
boundary case of ε = 1, along a parabola.

It is also possible to prove that, conversely, from the Keplerian
laws follows the law of gravitation F(r) = k/r². For this reason,
motions that obey such a law are termed Keplerian.
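The relation between the constants E, G and the shape of the orbit (22) can be made concrete with a few numbers. The following sketch uses illustrative units with m = k = G = 1; the function names are assumptions made for the example.

```python
import math

def orbit_parameters(E, G, k, m):
    """Return (p, eps) of the orbit r = p / (1 - eps*cos(phi - alpha)), formula (22)."""
    q = math.sqrt(2.0 / m * (E + m * k * k / (2.0 * G * G)))
    return G * G / (k * m), G * q / k

def effective_potential(r, G, k, m):
    """u(r) = -k/r + G^2 / (2 m r^2) for the attractive inverse-square force."""
    return -k / r + G * G / (2.0 * m * r * r)

m, k, G = 1.0, 1.0, 1.0
E = -0.375                              # negative energy: bounded motion expected
p, eps = orbit_parameters(E, G, k, m)
r_min = p / (1.0 + eps)                 # cos(phi - alpha) = -1
r_max = p / (1.0 - eps)                 # cos(phi - alpha) = +1
print(p, eps, r_min, r_max)

# At both extreme radii dr/dt = 0, so the effective potential must equal E.
print(effective_potential(r_min, G, k, m), effective_potential(r_max, G, k, m))

# A positive energy gives eps > 1, i.e. an unbounded (hyperbolic) path.
print(orbit_parameters(0.5, G, k, m)[1])
```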

Up to this point we have assumed the central body to be firmly
“attached” to the coordinate origin. The motion of the central body
can be neglected if its mass M is substantially greater than the moving
mass m, M ≫ m. It turns out, though we will not give the proof
here, that the case of comparable masses M and m leads to what
we have already discussed, but the second-order curves along which
both masses move will then have the focus at the centre of mass of
the whole system. It is interesting to note that both the kinetic energy
and the angular momentum about the centre of mass are distributed
between the two bodies in inverse proportion to their masses.
Indeed, for the sake of simplicity,
let us confine ourselves to the case of bodies moving in circles
centred at the centre of mass O with constant angular velocity ω and
denote by d the (constant) distance between the bodies (Fig. 144).
Then the kinetic energy and the angular momentum of the mass
M are equal, respectively, to

$$\frac{1}{2}M\left(\frac{\omega md}{M + m}\right)^2 \quad\text{and}\quad \frac{md}{M + m}\cdot\frac{M\omega md}{M + m}$$

whereas for the mass m these quantities are equal to

$$\frac{1}{2}m\left(\frac{\omega Md}{M + m}\right)^2 \quad\text{and}\quad \frac{Md}{M + m}\cdot\frac{m\omega Md}{M + m}$$

whence the inverse proportionality follows.


We leave it to the reader to consider the case of repulsion via
Coulomb's law, i.e. k < 0. Note however that for G ≠ 0 the
trajectories are hyperbolas.

As another instance of a central force we consider the force
F = −kr (k > 0) that is proportional to the distance of the particle
from the origin. Here the potential U(r) = (k/2) r². This potential can
be written in terms of Cartesian coordinates as

$$U = \frac{k}{2}(x^2 + y^2 + z^2)$$

By virtue of Sec. 4.6, such is the form of the principal (quadratic)
portion of the expansion of the potential U in powers of x, y, z
about the point of its minimum in the isotropic case, i.e. when the
equipotential surfaces are, to within infinitesimals of higher order,
not ellipsoids but spheres. Therefore the potential under consideration
describes small oscillations of a particle about an arbitrary position
of its stable equilibrium (without singularities of the potential)
in an isotropic case. A case of this kind is, for example, realized in
the scheme of Fig. 145 if all the springs have the same rigidity. From
Sec. 10.6 it follows that such a potential arises also inside a homogeneous
attractive sphere. We can imagine that the particle is moving
inside a channel cut in accord with the computed path in the sphere,
the channel being so narrow that its effect on the gravitational field
can be disregarded.

[Fig. 145]

If for the xy-plane we take the plane in which the particle
moves, then the differential equation of motion in this instance can
be written thus:

$$m\frac{d^2}{dt^2}(x\mathbf{i} + y\mathbf{j}) = -kr\,\mathbf{r}^0 = -k\mathbf{r} = -k(x\mathbf{i} + y\mathbf{j})$$

or, in terms of projections,

$$m\frac{d^2x}{dt^2} + kx = 0, \qquad m\frac{d^2y}{dt^2} + ky = 0$$

Thus, in Cartesian coordinates the variables are separated, i.e. the
oscillations along both axes occur independently of one another. By
Sec. 7.3, the law of motion becomes

$$x = r_1\cos(\omega t + \varphi_1), \qquad y = r_2\cos(\omega t + \varphi_2) \tag{23}$$

where ω = √(k/m) and the constants r₁, φ₁, r₂, φ₂ are determined by
the initial conditions. We leave it to the reader to verify
that equations (23) determine the particle trajectory as being
an ellipse with centre at the origin.

Thus, for the force F(r) = kr all paths again turn out to be closed,
and the period 2π/ω = 2π√(m/k) of oscillations of the particle does
not depend on the initial conditions. From this it follows that if,
say, we dug a well through the centre of the earth and pumped out
all the air, then the oscillation period of a stone thrown into the well
would not depend on the amplitude and would prove to be equal
to the period of revolution of a satellite at the earth's surface (air
resistance disregarded).
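The equality of the two periods can be checked on numbers. The sketch below assumes a homogeneous earth, so that inside it the attraction is F = −kr with k = mg/R, and uses rounded values for the surface gravity g and the earth's radius R; these figures and the function names are assumptions, not data from the text.

```python
import math

G_SURFACE = 9.81     # gravitational acceleration at the surface, m/s^2 (assumed)
R_EARTH = 6.371e6    # radius of the earth, m (assumed)

def well_period(mass, g, R):
    """Oscillation period in the well: F = -k r with k = mass*g/R, T = 2*pi*sqrt(mass/k)."""
    k = mass * g / R
    return 2.0 * math.pi * math.sqrt(mass / k)

def satellite_period(g, R):
    """Circular orbit at the surface: v^2 / R = g, T = 2*pi*R / v."""
    v = math.sqrt(g * R)
    return 2.0 * math.pi * R / v

T_well = well_period(1.0, G_SURFACE, R_EARTH)
T_sat = satellite_period(G_SURFACE, R_EARTH)
print(T_well / 60.0, T_sat / 60.0)   # both in minutes
```

The mass of the stone cancels out, so `well_period` returns the same value for any `mass`.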

Of course, in the second example above, the closed nature of the
path was due to the fact that the variables were separated in the
Cartesian axes and the oscillations along both axes were of the same
period. But the surprising and remarkable thing is that in the first
example, where the variables are not separated, the paths proved
to be closed too! This is something in the nature of a mathematical
marvel. Indeed, the form of the law F = k/r² was obtained in Sec. 10.7
directly from the condition div F(r) = 0 (r ≠ 0), which is natural
from the standpoint of physics. In contrast, the closed nature of the
paths under such a law of attraction is not at all obvious and was
proved with the aid of artificial mathematical computations. This
“marvel” substantially simplified the measurement and thus the
analysis of comparatively small deviations of the actual trajectories
from elliptic form (for instance, the motion of the perihelion of
Mercury; as it will be recalled, the explanation of this motion, which
follows from the general theory of relativity, served as one of the
first decisive confirmations of this theory).

Exercises 

1.  Find  the  oscillation  period  of  a particle  about  ellipse  (22)  and 
establish  the  relation  of  this  period  to  the  semimajor  axis  of  the 
ellipse. 

2. Find the period of variation of the radius of a particle for the law
of attraction $\mathbf{F} = -k\mathbf{r}$ with the aid of formula (17) and explain
the discrepancy with the result obtained in this section.

11.4 Rotation of a rigid body

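Consider the rotation of a rigid body $(\Omega)$ about a fixed axis
(the $z$-axis). The angular momentum $\mathbf{G}$ of this body about some
point $O$ on the axis of rotation is readily connected with the angular
velocity $\boldsymbol{\omega} = \omega\mathbf{k}$ of rotation via formula (5):

$$\mathbf{G} = \int_{(\Omega)} \mathbf{r} \times \mathbf{v}\, dm = \int_{(\Omega)} \mathbf{r} \times (\omega\mathbf{k} \times \mathbf{r})\, \rho\, d\Omega = \omega \int_{(\Omega)} \rho\, \mathbf{r} \times (\mathbf{k} \times \mathbf{r})\, d\Omega$$

where $\rho$ denotes the (generally variable) density of the body. Recalling
that $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, we find, by (4),

$$\mathbf{r} \times (\mathbf{k} \times \mathbf{r}) = (\mathbf{r} \cdot \mathbf{r})\,\mathbf{k} - (\mathbf{r} \cdot \mathbf{k})\,\mathbf{r} = (x^2 + y^2 + z^2)\,\mathbf{k} - z(x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) = -xz\,\mathbf{i} - yz\,\mathbf{j} + (x^2 + y^2)\,\mathbf{k}$$

whence

$$G_x = -\omega \int_{(\Omega)} \rho\, xz\, d\Omega, \qquad G_y = -\omega \int_{(\Omega)} \rho\, yz\, d\Omega, \qquad G_z = \omega \int_{(\Omega)} \rho\,(x^2 + y^2)\, d\Omega$$

The last integral,

$$I_z = \int_{(\Omega)} \rho\,(x^2 + y^2)\, d\Omega \qquad (24)$$

is called the moment of inertia of the body $(\Omega)$ about the $z$-axis, so
that $G_z = \omega I_z$.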

Suppose the body $(\Omega)$ is acted upon by certain specified forces
$\mathbf{F}^{\text{ex}}$ and the forces $\mathbf{F}^{\text{sup}}$ of reaction in the supports, which are not
given beforehand. If the latter forces are applied to the axis of rotation,
then their moment $\mathbf{r}^{\text{sup}} \times \mathbf{F}^{\text{sup}} = z^{\text{sup}}\,\mathbf{k} \times \mathbf{F}^{\text{sup}}$ is perpendicular
to the $z$-axis. To avoid considering these unknown forces, let us
project (11) on the $z$-axis to get

$$\frac{dG_z}{dt} = \frac{d(\omega I_z)}{dt} = L_z \qquad (25)$$

In the formation of the right side, i.e. the projection of the total moment
of the external forces on the axis of rotation, we have to take
into account only the specified forces $\mathbf{F}^{\text{ex}}$. We have thus passed from
the vector equation (11) to the scalar equation (25).

Let us consider for instance the case where the external forces
$\mathbf{F}^{\text{ex}}$ are absent or are directed parallel to the axis of rotation. Then
$L_z = 0$ and from (25) we get $\omega I_z = \text{constant}$. If the body does not
change under these circumstances, then $I_z = \text{constant}$, and for this
reason $\omega = \text{constant}$, which means the angular velocity of rotation
remains unchanged. (Here we disregard the friction at the points of
support, which leads to an additional term in the right-hand
member of (25) and, as a consequence, to a damping out of the velocity
of rotation.) But if $I_z$ varies, then $\omega$ varies in inverse proportion
to $I_z$. A dancer uses this property when pivoting on one leg.
She begins the rotation with outstretched arms and then quickly
brings them together (usually downwards to the initial position).
In this way, by virtue of (24), she reduces the $I_z$ of her body (very
substantially, since outstretched arms make an appreciable contribution
to $I_z$) and correspondingly increases $\omega$. (As A. Ya. Vaganova
has stated in her Principles of Classical Dancing, "this motion of the
hands produces the extra force needed for the turn".) At the
same time the dancer rises onto her toes to reduce friction. After a
few turns she again extends her arms, reducing $\omega$, and comes to a
halt on both feet.
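
The inverse proportionality of $\omega$ and $I_z$ can be illustrated in a few lines. This is only a sketch: the helper `spin_rate_after` and all numerical values are hypothetical, chosen for illustration, and friction is ignored as in the text.

```python
def spin_rate_after(omega_before, inertia_before, inertia_after):
    """Angular speed after I_z changes, using omega * I_z = constant (L_z = 0)."""
    return omega_before * inertia_before / inertia_after

# Hypothetical numbers, not taken from the text.
omega_arms_out = 2.0   # rad/s, arms outstretched
I_arms_out = 3.0       # kg m^2
I_arms_in = 1.2        # kg m^2, arms brought together

omega_arms_in = spin_rate_after(omega_arms_out, I_arms_out, I_arms_in)

# The product omega * I_z (that is, G_z) is the same before and after.
G_before = omega_arms_out * I_arms_out
G_after = omega_arms_in * I_arms_in
print(omega_arms_in, G_before, G_after)
```

With these assumed numbers the spin rate rises by the factor $3.0/1.2 = 2.5$ while $G_z$ is unchanged.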

Now  let  us  examine  a gyroscope  (actually  a top),  i.e.  a rigid 
body  that  rotates  about  a fixed  point,  which  we  take  for  the  origin 
of coordinates. It is assumed that the gyroscope has an axis of symmetry
passing through the point of support and rapidly rotates about
that  axis,  frictional  forces  being  negligibly  small. 

Note that if a rigid body is in rotation about an axis of symmetry
that passes through the origin, then the angular momentum of this
body  is  directed  along  the  axis  of  rotation  and  is  proportional  to  the 
angular  speed  of  rotation.  Indeed,  the  first  follows  at  least  from 
reasons  of  symmetry,  the  second,  directly  from  the  definition  (10). 
On  the  other  hand,  in  mechanics  proof  is  given  that  gravitational 
forces  can  be  replaced  by  a resultant  passing  through  the  centre  of 
gravity  of  the  body,  which  clearly  lies  on  the  axis  of  symmetry. 
Therefore  if  a gyroscope  is  attached  at  the  centre  of  gravity  or  if 
it  is  set  in  motion  in  the  vertical  position  (in  both  cases  the  moment 
of the force of gravity is zero), then from (11) it follows that the angular
momentum and, with it, the axis of rotation, remain constant.
In  the  case  of  brief  thrusts  the  right-hand  side  of  (11)  can  for  short 
times  become  nonzero.  Then  G receives  a certain  increment,  but  the 
greater  the  angular  velocity,  that  is,  the  greater  | G |,  the  smaller  is 
the relative value of this increment, i.e. the more stable is the gyroscope.
The use of the gyroscope in engineering is based on this property.

Now let us see what will happen if the centre of gravity is located
above the point of attachment (as in the case of a child's top), and
the axis of rotation is nonvertical. Then the force of gravity $\mathbf{P}$ applied
to the centre of gravity $C$ creates a tilting moment $\mathbf{r}_C \times \mathbf{P}$ (Fig. 146).
By equation (11), the angular momentum $\mathbf{G}$ will in time $dt$ receive
an infinitesimal increment $d\mathbf{G} = (\mathbf{r}_C \times \mathbf{P})\,dt$. Since this infinitesimal
vector is horizontal and perpendicular to the vector $\mathbf{G}$, it follows that the
vector $\mathbf{G} + d\mathbf{G}$ is obtained from $\mathbf{G}$ by a rotation about the vertical axis
through an infinitesimal angle. This means that the numerical value
of the angular velocity does not change but the axis of rotation of
the gyroscope will turn about the vertical axis. The same reasoning
can be applied to the new position and we see that because of the
force of gravity the axis of symmetry of the gyroscope receives a
supplementary uniform rotatory motion (called precessional motion)
about the vertical axis. The greater $|\mathbf{G}|$ is, the smaller is the rate of
precession, since for large $|\mathbf{G}|$ the same $d\mathbf{G}$ implies a rotation through
a smaller angle.

Fig. 146
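
The argument above can be followed step by step on a computer. The sketch below integrates $d\mathbf{G} = (\mathbf{r}_C \times \mathbf{P})\,dt$ under the same approximation as in the text, namely that $\mathbf{r}_C$ points along $\mathbf{G}$. The function names, the midpoint integration scheme and all numbers are assumptions made for illustration. For this model the rate of turning about the vertical works out to $|\mathbf{r}_C|\,|\mathbf{P}|/|\mathbf{G}|$, which the code compares with the measured angle.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def norm(a):
    return math.sqrt(sum(c*c for c in a))

def torque(G, lever, weight):
    """Moment r_C x P, with r_C of length `lever` taken along G (fast-spin approximation)."""
    g_len = norm(G)
    r_c = tuple(lever * c / g_len for c in G)
    return cross(r_c, (0.0, 0.0, -weight))   # gravity acts along -z

def precess(G, lever, weight, t_end, steps):
    """Integrate dG/dt = r_C x P with the midpoint rule."""
    dt = t_end / steps
    for _ in range(steps):
        k1 = torque(G, lever, weight)
        mid = tuple(g + 0.5*dt*k for g, k in zip(G, k1))
        k2 = torque(mid, lever, weight)
        G = tuple(g + dt*k for g, k in zip(G, k2))
    return G

# Hypothetical values: |G| = 0.5, lever arm 0.05, weight 1.0.
G0 = (0.3, 0.0, 0.4)
lever, weight, t_end = 0.05, 1.0, 5.0

G1 = precess(G0, lever, weight, t_end, steps=1000)
angle = math.atan2(G1[1], G1[0])                   # turn of G about the vertical
expected = lever * weight / norm(G0) * t_end       # precession rate times time
print(angle, expected, norm(G1))
```

The modulus of $\mathbf{G}$ and its vertical projection stay fixed, and doubling $|\mathbf{G}|$ in this model halves the angle turned in the same time.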

The  foregoing  conclusions  are  not  quite  exact  because,  due  to 
precession,  the  angular  momentum  does  not  lie  exactly  along  the 
axis  of  the  gyroscope  but  deviates  from  it.  That  is  why  the  motion 
of  a gyroscope  is  actually  more  complicated  than  just  described. 
But  this  correction  is  slight  as  long  as  the  angular  velocity  is  great. 
If the gyroscope is not kept spinning at the same rate, the unavoidable
and constant friction reduces the angular velocity and the rate
of  precession  increases,  and  when  these  rates  become  comparable, 
the  nature  of  the  motion  changes  appreciably  and,  finally,  the  top 
falls  to  a halt. 

Exercise 

Compute $I_z$ for a homogeneous (a) cylinder of radius $R$ and
mass $M$ with $Oz$ as the axis of rotation, (b) rectilinear rod of
length $L$ and mass $m$ if the $z$-axis serves as a perpendicular to
the midpoint of the rod.

11.5 Symmetric and antisymmetric tensors

The angular momentum of a rigid body is a good domain for
applying the concept of a tensor (Sec. 9.5). We will use the tensor
notation given in Sec. 9.5. Let the origin $O$ be fixed but let the coordinate
axes be chosen in a variety of ways. From Sec. 11.4 it is clear
that when considering the angular momentum of a body $(\Omega)$ an essential
role is played by the integrals

$$I_{ij} = \begin{cases} -\displaystyle\int_{(\Omega)} \rho\, x_i x_j\, d\Omega & (i \neq j), \\[2mm] \displaystyle\int_{(\Omega)} \rho\,(x_k x_k - x_i x_j)\, d\Omega & (i = j) \end{cases} \qquad (26)$$

It is easy to verify that the quantities (26) transform, under a
change of coordinate basis, via the tensor rule (23) of Ch. 9 and for
this reason constitute a tensor of rank two:

$$\mathbf{I} = I_{ij}\,\mathbf{e}_i \mathbf{e}_j \qquad (27)$$

This is called the inertia tensor of the body $(\Omega)$ with respect to the
point $O$. To prove this, write the quantities $I_{ij}$ as a single formula:

$$I_{ij} = \int_{(\Omega)} \rho\,(\delta_{ij}\, x_k x_k - x_i x_j)\, d\Omega$$

where the Kronecker delta $\delta_{ij}$ is defined for any choice of basis by

$$\delta_{ij} = \begin{cases} 0 & (i \neq j), \\ 1 & (i = j) \end{cases}$$

We leave it to the reader to convince himself that the quantities $\delta_{ij}$
form a second-rank tensor (called the unit tensor). Now if the basis is
replaced by the formulas (19) of Ch. 9, the quantities $x_i$ transform
(as projections of the vector $\mathbf{r}$) via the formulas $x'_i = \alpha_{ij} x_j$, whence,
applying the equation $x'_k x'_k = x_k x_k$ (where does it follow from?), we get

$$I'_{ij} = \int_{(\Omega)} \rho\,(\delta_{ij}\, x'_k x'_k - x'_i x'_j)\, d\Omega = \int_{(\Omega)} \rho\,(\alpha_{ir}\alpha_{jl}\,\delta_{rl}\, x_k x_k - \alpha_{ir}\alpha_{jl}\, x_r x_l)\, d\Omega = \alpha_{ir}\alpha_{jl} \int_{(\Omega)} \rho\,(\delta_{rl}\, x_k x_k - x_r x_l)\, d\Omega = \alpha_{ir}\alpha_{jl}\, I_{rl}$$

which completes the proof.
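
The transformation rule just proved can be spot-checked numerically. The sketch below replaces the integral over $(\Omega)$ by a sum over a few point masses (an assumption made to keep the example short) and uses a rotation about the $z$-axis as the change of basis; the function names and the data are hypothetical.

```python
import math

def inertia_tensor(points):
    """I_ij = sum of m (delta_ij x_k x_k - x_i x_j) over point masses (m, (x1, x2, x3))."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, x in points:
        r2 = sum(c * c for c in x)
        for i in range(3):
            for j in range(3):
                I[i][j] += m * ((r2 if i == j else 0.0) - x[i] * x[j])
    return I

def rotation_about_z(angle):
    """Orthogonal matrix alpha_ij of a change of basis (rotation of the axes about z)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_vector(alpha, x):
    """x'_i = alpha_ij x_j."""
    return tuple(sum(alpha[i][j] * x[j] for j in range(3)) for i in range(3))

def transform_tensor(alpha, I):
    """I'_ij = alpha_ir alpha_jl I_rl."""
    return [[sum(alpha[i][r] * alpha[j][l] * I[r][l]
                 for r in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

# Hypothetical body: three point masses.
points = [(1.0, (1.0, 0.0, 2.0)), (2.0, (0.5, -1.0, 0.3)), (0.7, (-1.2, 0.4, 0.9))]
alpha = rotation_about_z(0.6)

# Route 1: recompute the integrals (sums) in the new coordinates.
I_direct = inertia_tensor([(m, transform_vector(alpha, x)) for m, x in points])
# Route 2: apply the tensor rule to the old components.
I_rule = transform_tensor(alpha, inertia_tensor(points))

max_diff = max(abs(I_direct[i][j] - I_rule[i][j]) for i in range(3) for j in range(3))
print(max_diff)
```

The two routes agree to rounding error, and the computed components satisfy $I_{ij} = I_{ji}$.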

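Now let the body $(\Omega)$ be in rotation with angular velocity $\boldsymbol{\omega}$
about an axis passing through the origin $O$. We will show how the
corresponding angular momentum $\mathbf{G}$ about $O$ is expressed. Reasoning
as in Sec. 11.4 and using formula (4), we obtain

$$\mathbf{G} = \int_{(\Omega)} \rho\, \mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})\, d\Omega = \int_{(\Omega)} \rho\,[(\mathbf{r} \cdot \mathbf{r})\,\boldsymbol{\omega} - (\mathbf{r} \cdot \boldsymbol{\omega})\,\mathbf{r}]\, d\Omega = \int_{(\Omega)} \rho\,(x_k x_k\, \omega_i \mathbf{e}_i - x_j \omega_j\, x_i \mathbf{e}_i)\, d\Omega \qquad (28)$$

Fig. 147

In order to take $\omega_j$ and $\mathbf{e}_i$ outside the brackets under the integral
sign, write $\omega_i = \delta_{ij}\omega_j$ (think through the meaning of this formula!).
This yields

$$\mathbf{G} = \int_{(\Omega)} \rho\,(\delta_{ij}\, x_k x_k - x_i x_j)\, \omega_j \mathbf{e}_i\, d\Omega = I_{ij}\,\omega_j\, \mathbf{e}_i$$

or, to put it differently,

$$G_i = I_{ij}\,\omega_j \qquad (29)$$

(We can also write $\mathbf{G} = \mathbf{I} \cdot \boldsymbol{\omega}$, where the scalar product of a second-rank
tensor by a vector is defined by $(\mathbf{a}\mathbf{b}) \cdot \mathbf{c} = \mathbf{a}(\mathbf{b} \cdot \mathbf{c}) = (\mathbf{b} \cdot \mathbf{c})\,\mathbf{a}$,
whence

$$\mathbf{I} \cdot \boldsymbol{\omega} = I_{ij}\,\mathbf{e}_i \mathbf{e}_j \cdot (\omega_k \mathbf{e}_k) = I_{ij}\,\omega_k\, \mathbf{e}_i\,(\mathbf{e}_j \cdot \mathbf{e}_k) = I_{ij}\,\omega_j\, \mathbf{e}_i = \mathbf{G}.)$$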

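Actually, formula (29) coincides with the formulas of Sec. 11.4,
but, unlike the situation in Sec. 11.4, it is not connected with a special
choice of Cartesian axes. In particular, it shows that with the
exception of the spherically symmetric case (where $I_{ij} = I\,\delta_{ij}$), the
directions of the vectors $\boldsymbol{\omega}$ and $\mathbf{G}$ are, generally, distinct. For example,
if for the body shown in Fig. 147 the masses of the sleeve and rod are
negligibly small compared with $m$, the angular momentum is, by (4),
equal to

$$\mathbf{G} = m\,\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r}) = m r^2 \boldsymbol{\omega} - m(\boldsymbol{\omega} \cdot \mathbf{r})\,\mathbf{r}$$

which means the vector $\mathbf{G}$ is in rotation with angular velocity $\boldsymbol{\omega}$.
This change in the angular momentum is compensated for, by virtue
of (11), by the moment transmitted by the sleeve onto the axis of rotation.

We will now derive another formula for the kinetic energy $T$ of
a rotating body $(\Omega)$. To begin with, note that for any two vectors $\mathbf{a}$,
$\mathbf{b}$ it will be true, on the basis of a familiar formula for the area of
a parallelogram, that $|\mathbf{a} \times \mathbf{b}| = ab \sin(\widehat{\mathbf{a}, \mathbf{b}})$. On the other hand,
$\mathbf{a} \cdot \mathbf{b} = ab \cos(\widehat{\mathbf{a}, \mathbf{b}})$, whence

$$|\mathbf{a} \times \mathbf{b}|^2 + (\mathbf{a} \cdot \mathbf{b})^2 = a^2 b^2 [\sin^2(\widehat{\mathbf{a}, \mathbf{b}}) + \cos^2(\widehat{\mathbf{a}, \mathbf{b}})] = a^2 b^2 \qquad (30)$$

Form the scalar product of (28) by $\boldsymbol{\omega}$ and take advantage of
formula (30) to get

$$\mathbf{G} \cdot \boldsymbol{\omega} = \int_{(\Omega)} \rho\,[(\mathbf{r} \cdot \mathbf{r})(\boldsymbol{\omega} \cdot \boldsymbol{\omega}) - (\mathbf{r} \cdot \boldsymbol{\omega})(\mathbf{r} \cdot \boldsymbol{\omega})]\, d\Omega = \int_{(\Omega)} \rho\,|\mathbf{r} \times \boldsymbol{\omega}|^2\, d\Omega = 2T$$

(the last step holds because $|\mathbf{r} \times \boldsymbol{\omega}|$ is the linear speed of the point $\mathbf{r}$),
whence, by (29),

$$T = \frac{1}{2}\,\mathbf{G} \cdot \boldsymbol{\omega} = \frac{1}{2}\, I_{ij}\,\omega_i \omega_j \qquad (31)$$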
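
For a single point mass the statements above are easy to check directly. The sketch below uses hypothetical values of `m`, `r` and `omega`; it compares the two expressions for $\mathbf{G}$ given by formula (4), shows that $\mathbf{G}$ is not parallel to $\boldsymbol{\omega}$, and checks $T = \frac{1}{2}\,\mathbf{G} \cdot \boldsymbol{\omega}$ against $\frac{1}{2} m v^2$.

```python
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

# Hypothetical point mass on a massless rod inclined to the axis of rotation.
m = 2.0
r = (1.0, 0.0, 1.0)
omega = (0.0, 0.0, 3.0)

# G = m r x (omega x r), and the same thing expanded by formula (4).
G_triple = tuple(m * c for c in cross(r, cross(omega, r)))
r2 = dot(r, r)
G_expanded = tuple(m * (r2 * w - dot(omega, r) * x) for w, x in zip(omega, r))

# G x omega vanishes only if G is parallel to omega.
misalignment = cross(G_expanded, omega)

# Kinetic energy two ways: (1/2) G . omega and (1/2) m |omega x r|^2.
v = cross(omega, r)
T_from_G = 0.5 * dot(G_expanded, omega)
T_from_v = 0.5 * m * dot(v, v)

print(G_triple, G_expanded, misalignment, T_from_G, T_from_v)
```

Here $\mathbf{G}$ has a component perpendicular to the axis, so it sweeps out a cone as the body turns, while the two energy values coincide.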

The tensor (27) possesses the important property of symmetry:

$$I_{ij} = I_{ji} \qquad (32)$$

which follows immediately from the definition (26). Generally speaking,
a tensor of rank 2 with this property is called a symmetric
tensor. It is easy to prove here that if the property (32) holds for some
one basis, then it holds true for any choice of basis.

The basic property of a symmetric tensor (formula (25) of Ch. 9)
is the possibility of reducing it to diagonal form, that is, the possibility
of a choice of basis $\mathbf{e}'_i$ for which all quantities $p'_{ij}$ vanish when
$i \neq j$. We will not give the proof of this general assertion (which holds
true for symmetric tensors in a space of any number of dimensions),
but two remarks are in order. First, since a diagonal tensor has the
property of symmetry and this property is conserved under a change
of basis, it follows that only symmetric tensors can be reduced to
diagonal form. Second, since in the choice of Euclidean basis in
three-dimensional space there are three degrees of freedom (why?), and to
reduce a tensor to diagonal form, the three equalities $p'_{12} = p'_{13} = p'_{23} = 0$
must be satisfied, there turn out to be exactly as many degrees
of freedom as are needed.

In  the  axes  xt  in  which  the  inertia  tensor  has  diagonal  form  (they 
are  called  the  principal  axes  of  that  tensor),  expressions  (29)  for  the 
projections  of  the  angular  momentum  and  (31)  for  the  kinetic  energy 
take  the  form 

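$$G_1 = I_{11}\omega_1, \qquad G_2 = I_{22}\omega_2, \qquad G_3 = I_{33}\omega_3,$$

$$T = \frac{1}{2}\,(I_{11}\omega_1^2 + I_{22}\omega_2^2 + I_{33}\omega_3^2) = \frac{G_1^2}{2I_{11}} + \frac{G_2^2}{2I_{22}} + \frac{G_3^2}{2I_{33}} \qquad (33)$$

Fig. 148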


In order to get a better picture of the dependence of the kinetic
energy $T$ on the direction of the angular momentum $\mathbf{G}$, depict on a
sphere $|\mathbf{G}| = \text{constant}$ rigidly attached to the body $(\Omega)$ certain lines
to which correspond equal values of $T$; in other words, these are
lines with equation (33) for distinct values of $T$ (Fig. 148). For the
sake of definiteness, suppose that $I_{11} < I_{22} < I_{33}$. Then for a given
$|\mathbf{G}|$, the greatest value of $T$, equal to $|\mathbf{G}|^2 / (2I_{11})$, is obtained when
$G_2 = G_3 = 0$ (Fig. 148); we have the smallest value of $T$ for $G_1 = G_2 = 0$,
whereas $T$ has a minimax for $G_1 = G_3 = 0$. In free rotation, $\mathbf{G}$
and $T$ are constant but $\boldsymbol{\omega}$ varies both in direction and in modulus
(except for rotation about one of the $x_i$-axes, when $\boldsymbol{\omega} \parallel \mathbf{G} \parallel \mathbf{e}_i$). The body
$(\Omega)$ rotates so that the vector $\mathbf{G}$ always passes through one of the
indicated lines (Fig. 148 shows the motion of $\mathbf{G}$ with respect to $(\Omega)$).

Another  important  instance  of  a symmetric  tensor  is  the  elastic 
stress  tensor.  Suppose  we  are  considering  the  stressed  state  of  a rigid 
body  caused  by  forces  applied  to  it.  Conceive  of  a cube  with  side  h 
and  edges  parallel  to  the  coordinate  axes  isolated  in  the  medium  under 
study.  Then  the  elastic  action  of  the  ambient  medium  on  each  of  the 
faces of the cube can be replaced by a force (Fig. 149) which is proportional
to $h^2$ for small $h$:


*** 

The quantities $\sigma_{ij}$ depend on the choice of direction of the coordinate
axes. It can be shown that under a change of basis they transform by
the tensor rule (23) of Ch. 9 and for this reason form a second-rank
tensor, which is called the elastic stress tensor and characterizes the
stressed state of the medium at the point in question. The diagonal
terms of this tensor define the stresses of compression or tension,
and the off-diagonal terms define the shear stresses.

Fig. 149

It is easy to verify that the elastic stress tensor is a symmetric
tensor. Indeed, the total moment of external forces acting on the cube
with respect to its centre is equal, up to higher-order infinitesimals,
to $\sum_{i,j} h^3 \sigma_{ij}\, \mathbf{e}_i \times \mathbf{e}_j$. If any $\sigma_{ij} \neq \sigma_{ji}$, then this moment
would be nonzero, which is impossible since the moment of inertia
of the cube is of the order of $h^5$ and we would have an infinite angular
acceleration.

Use is also made of antisymmetric tensors $(p_{ij})$ for which

$$p_{ij} = -p_{ji} \qquad (34)$$

The  diagonal  terms  of  this  tensor  are  of  necessity  equal  to  zero.  It 
is  easy  to  verify  that  if  the  antisymmetry  property  (34)  is  fulfilled  for 
some  one  choice  of  basis,  then  it  holds  true  for  any  other  choice 
of  basis. 

An antisymmetric tensor in three-dimensional space is directly
connected with a certain vector product. Namely, suppose we have a linear
mapping of space into itself with an antisymmetric matrix of the
coefficients $p_{ij}$ (see Sec. 9.5, where we stated that any second-rank
tensor can, to within dimensionality, be interpreted as the matrix
of a certain linear mapping of the space into itself). Then any vector
$\mathbf{r} = x_j \mathbf{e}_j$ is mapped via the formula

$$T\mathbf{r} = p_{ij}\, x_j\, \mathbf{e}_i = p_{12} x_2 \mathbf{e}_1 + p_{13} x_3 \mathbf{e}_1 - p_{12} x_1 \mathbf{e}_2 + p_{23} x_3 \mathbf{e}_2 - p_{13} x_1 \mathbf{e}_3 - p_{23} x_2 \mathbf{e}_3$$

But we get the same result (verify this!) if we form the vector product
of $\mathbf{p} = -(p_{12}\mathbf{e}_3 + p_{23}\mathbf{e}_1 + p_{31}\mathbf{e}_2)$ and $\mathbf{r}$; thus, $T\mathbf{r} = \mathbf{p} \times \mathbf{r}$. From the
last equation it is clear that the vector $\mathbf{p}$ is defined by the mapping $T$,
and so also by the tensor $p_{ij}$, invariantly, which means independently
of any choice of a Cartesian basis.
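
The verification suggested above can also be done numerically. The following sketch builds an antisymmetric matrix from three arbitrarily chosen (hypothetical) components, applies it to a vector, and compares the result with $\mathbf{p} \times \mathbf{r}$; indices in the code run from 0 rather than 1.

```python
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def apply_tensor(p, x):
    """Components of T r: (T r)_i = p_ij x_j."""
    return tuple(sum(p[i][j] * x[j] for j in range(3)) for i in range(3))

def axial_vector(p):
    """p = -(p_23 e_1 + p_31 e_2 + p_12 e_3), written with zero-based indices."""
    return (-p[1][2], -p[2][0], -p[0][1])

# Hypothetical antisymmetric tensor: p_12 = 1.5, p_13 = -0.7, p_23 = 2.0.
p = [[0.0, 1.5, -0.7],
     [-1.5, 0.0, 2.0],
     [0.7, -2.0, 0.0]]
r = (0.3, -1.1, 2.4)

mapped = apply_tensor(p, r)
via_cross = cross(axial_vector(p), r)
print(mapped, via_cross)
```

The two triples coincide, in agreement with $T\mathbf{r} = \mathbf{p} \times \mathbf{r}$.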

In conclusion consider the tensor of a linear mapping that is
close to an identity mapping. It has the form $\delta_{ij} + \eta_{ij}$, where $\delta_{ij}$
is the unit tensor and $\eta_{ij}$ are small coefficients. Such a tensor
results, for example, when considering small deformations of an
elastic body.

Represent the tensor $\eta_{ij}$ as the sum of a symmetric tensor $\beta_{ij}$
and an antisymmetric tensor $\gamma_{ij}$ where

$$\beta_{ij} = \frac{1}{2}\,(\eta_{ij} + \eta_{ji}), \qquad \gamma_{ij} = \frac{1}{2}\,(\eta_{ij} - \eta_{ji}) \qquad (35)$$

Then the tensor $\beta_{ij}$ assumes diagonal form in certain axes $\bar{x}_i$ and for
this reason determines a combination of small uniform tensions along
these axes: $1 + \bar{\beta}_{11}$ times along the $\bar{x}_1$-axis (of course if $\bar{\beta}_{11} < 0$, then
we have a compression), and so on. Here the volume will increase
$(1 + \bar{\beta}_{11})(1 + \bar{\beta}_{22})(1 + \bar{\beta}_{33})$-fold, or, to within higher-order
infinitesimals, $1 + \bar{\beta}_{11} + \bar{\beta}_{22} + \bar{\beta}_{33} = 1 + \bar{\beta}_{ii}$ times. But $\bar{\beta}_{ii} = \beta_{ii}$ (see Sec.
9.5) and $\beta_{ii} = \eta_{ii}$ by virtue of (35).

The tensor $\gamma_{ij}$ defines a small rotation. Indeed, under such a
rotation about the vector $\mathbf{p}^0$ through a small angle $\varphi$ each vector $\mathbf{r}$,
up to higher infinitesimals, goes into the vector $\mathbf{r} + \varphi\,\mathbf{p}^0 \times \mathbf{r}$; now
we have demonstrated above how one can choose an appropriate
vector $\mathbf{p} = \varphi\,\mathbf{p}^0$ using the antisymmetric tensor $\gamma_{ij}$.

To summarize: a linear mapping that is almost an identity mapping
reduces to a combination of uniform tensions along mutually
perpendicular axes and a rotation. Since the volume does not change
under a rotation, the sum $\eta_{ii}$ (it is called the trace of the tensor $\eta_{ij}$)
is the coefficient of the increment in volume, which means that under
the given mapping the volume increases $1 + \eta_{ii}$ times.
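
The role of the trace can be checked on a concrete example. The sketch below uses a hypothetical small tensor $\eta_{ij}$ (entries of order $10^{-3}$), splits it by formula (35), and compares the exact volume factor, computed here as the determinant of the matrix $\delta_{ij} + \eta_{ij}$, with $1 + \eta_{ii}$. Using the determinant for the volume factor is a standard fact assumed here rather than taken from the text.

```python
def det3(a):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

# Hypothetical small coefficients eta_ij.
eta = [[1.0e-3, 4.0e-4, -2.0e-4],
       [-1.0e-4, -5.0e-4, 3.0e-4],
       [6.0e-4, 2.0e-4, 2.0e-3]]

# Formula (35): symmetric part (tensions) and antisymmetric part (rotation).
beta = [[0.5 * (eta[i][j] + eta[j][i]) for j in range(3)] for i in range(3)]
gamma = [[0.5 * (eta[i][j] - eta[j][i]) for j in range(3)] for i in range(3)]

# Matrix of the mapping delta_ij + eta_ij and its exact volume factor.
mapping = [[(1.0 if i == j else 0.0) + eta[i][j] for j in range(3)] for i in range(3)]
volume_factor = det3(mapping)
trace = sum(eta[i][i] for i in range(3))

print(volume_factor, 1.0 + trace)
```

The two numbers differ only by terms of second order in the coefficients, and the antisymmetric part contributes nothing to the trace.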

It is to be stressed that the superposition of deformations leads
to an addition of the appropriate tensors only for small deformations.
Actually, this is the result of applying the formula $(1 + \alpha)(1 + \beta) \approx 1 + (\alpha + \beta)$
for small $\alpha$, $\beta$. Large deformations (rotations through
a finite angle, for instance) are superimposed by quite a different
(generally, noncommutative) law, which we will not discuss here.

Exercises

1. Prove the theorem of the possibility of reducing a symmetric
tensor to diagonal form for the two-dimensional case.

2.  Decompose  a small  shear  (Fig.  11  \d)  into  tensions  and  a rotation. 

11.6 True vectors and pseudovectors

There  is  a fundamental  difference  between  the  linear  velocity 
vector  and  the  angular  velocity  vector.  There  is  no  doubt  about  the 
direction  of  the  linear  velocity  vector.  By  contrast,  and  in  accord  with 
Sec.  11.2,  the  angular  velocity  vector  is  laid  off  along  the  axis  of 
rotation,  but  in  what  direction?  In  Sec.  11.2  we  chose  that  direction 
in  accordance  with  the  rule  of  the  right-handed  screw.  But  we  could 
just  as  easily  have  chosen  the  rule  of  the  left-handed  screw  and  then 
in Fig. 137 the vector $\boldsymbol{\omega}$ would have been directed downwards. Thus,
the choice of direction of the angular velocity vector along the axis
of rotation is arbitrary and depends on the chosen rule of the screw:
if the rule is reversed, the vector is too. Such vectors are termed
pseudovectors to distinguish them from the true vectors that do not
depend on the choice of the rule of the screw. To summarize: the
linear velocity vector is a true vector (just like the vectors of force,
acceleration, electric intensity, and so forth), while the angular velocity
vector is a pseudovector. In another terminological classification,
true vectors are called polar vectors and pseudovectors are called axial
vectors.*

From the definition of a vector product it is clear that the vector
product of two true vectors is a pseudovector because under a change
in the rule of the screw the earlier outside of the parallelogram constructed
on the vectors being multiplied becomes the inside (see Figs.
134, 135). Thus, the moment of a force and the angular momentum
(Sec. 11.2) are pseudovectors. Similarly it can readily be verified
that the vector product of a true vector by a pseudovector is a true
vector (see for instance formula (5)) and the vector product of two
pseudovectors is a pseudovector.

* When considering processes that occur in time, there arises yet another
classification of vectors as to their behaviour under a change of sign of $t$
(compare, for instance, the vectors of velocity and displacement).

The question of the equivalence of the right-handed and left-handed
coordinate systems is not at all so simple as may appear at
first glance. This equivalence signifies that for any phenomenon we
can have a mirror reflection of it under which all geometric forms are
the mirror images of the originals, just like a right-hand glove is the
mirror image of the left-hand one. (This is the so-called "law of
conservation of parity".) Quite recently it was found that this law is
not universal, and the celebrated Soviet theoretical physicist
L. D. Landau (1908-1968) proposed the "principle of combined
parity". According to this principle all physical phenomena admit
reflections only if all particles are replaced by antiparticles.*

Note in conclusion that the vector multiplication of two vectors,
under which the projections of the vector product are expressed in
terms of the projections of the vector factors (yet at the same time
the product is invariant under a choice of coordinate axes) and the
ordinary rules of multiplication hold true, represents an operation that
is characteristic of three-dimensional space. To put it crudely, the
point is that in three-dimensional space we can agree to associate
with every two unit vectors (say, $\mathbf{i}$ and $\mathbf{j}$ along the axes $x$ and $y$)
a third unit vector $\mathbf{k}$ along the $z$-axis to complete the set of three.
In this way, by performing cyclic (circular) permutations, we can arrive
at the formulas (1) of the vector multiplication of the unit vectors.
(True because if $\mathbf{i} \times \mathbf{j} = \mathbf{k}$, then we must have $\mathbf{j} \times \mathbf{k} = \mathbf{i}$ since the
vector $\mathbf{k}$ is located relative to the vectors $\mathbf{i}$, $\mathbf{j}$ in exactly the same way
as $\mathbf{i}$ is relative to $\mathbf{j}$ and $\mathbf{k}$.) From the latter formulas we obtain the
formulas of vector multiplication of any vectors. In $n$-dimensional
vector space (Sec. 9.6) we have to take $n - 1$ unit vectors to determine
the $n$th missing vector. Therefore, in $n$-dimensional space an
analogue of a vector product is a vector appropriately constructed
out of $n - 1$ vectors.** Thus, the vector product of two vectors is
peculiar only to three-dimensional space. Naturally we have not
enumerated all the conditions that are necessary for determining
a vector product. But still we wanted to point out the difference between
a vector product and a scalar product, which is defined in precisely
the same way in a space of any number of dimensions.

Exercises 

1.  Do  the  formulas  (1)  and  (3)  hold  true  in  a left-handed  system 
of  coordinates? 

2.  Construct  a reasonable  definition  of  a vector  product  in  four- 
dimensional space. 

11.7 The curl of a vector field

In  calculating  the  work  of  a force  field  in  Sec.  10.3  we  already 
arrived  at  the  concept  of  a line  integral.  We  now  consider  it  in  the 
general  form. 

* At  the  end  of  1964  new  experimental  findings  suggested  that  the  principle  of 
combined  parity  is  not  exact  and  is  sometimes  violated. 

** The analogue of a mixed product (p. 391) is made up for $n$ vectors. This product
is equal to the $n$-volume of a "parallelepiped" spanned by these vectors.
For a plane (when $n = 2$) we have an area, which is a scalar or, to be
more exact, a pseudoscalar, "pseudo" meaning "almost". What we mean here
is that the quantity does not change under rotations of the coordinate system
but is multiplied by $-1$ under an interchange of the axes $x$ and $y$. An instance
of a pseudoscalar in three-dimensional space was the mixed product of three
true vectors equal to the volume: the sign of this quantity for the given three
vectors depends on the choice of a right- or left-handed system.

Suppose an oriented line $(L)$ is chosen in a space with a specified
vector field $\mathbf{A}$. Then the line integral of $\mathbf{A}$ along the line $(L)$ is the integral

$$I = \int_{(L)} A_\tau\, dL \qquad (36)$$

where $A_\tau$ is the projection of the vector $\mathbf{A}$ on
the tangent $\boldsymbol{\tau}$ to $(L)$ drawn in the direction of traversal (Fig. 150).
Since the vector $d\mathbf{r}$ goes along $\boldsymbol{\tau}$ and $|d\mathbf{r}| = dL$ (Sec. 9.4), it follows
that the expression for the line integral can be rewritten thus:

$$I = \int_{(L)} |\mathbf{A}| \cos\alpha\, |d\mathbf{r}| = \int_{(L)} \mathbf{A} \cdot d\mathbf{r} = \int_{(L)} (A_x\, dx + A_y\, dy + A_z\, dz)$$

A line integral is a scalar quantity and has the ordinary properties
of integrals. If the orientation of $(L)$ is reversed, the integral only
changes sign. If the angle $\alpha$ (see Fig. 150) at all points of $(L)$ is acute,
then $I > 0$, and if the angle is obtuse, then $I < 0$. The equality
$I = 0$ results if the angle $\alpha$ is always a right angle or (this happens
more often) if the integrals over parts of $(L)$, where $\alpha$ is acute and
$\alpha$ is obtuse, cancel out.
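
A line integral of this kind is easy to approximate by a sum of terms $\mathbf{A} \cdot \Delta\mathbf{r}$ over short chords of the line. The sketch below is one such approximation; the function `line_integral`, the sample field $\mathbf{A} = (-y, x, 0)$ and the unit circle are hypothetical choices made for illustration. For this field and this line the exact value is $2\pi$.

```python
import math

def line_integral(field, curve, t_start, t_end, n=2000):
    """Approximate the integral of A . dr along curve(t), t running from t_start to t_end."""
    total = 0.0
    for k in range(n):
        ta = t_start + (t_end - t_start) * k / n
        tb = t_start + (t_end - t_start) * (k + 1) / n
        ra, rb = curve(ta), curve(tb)
        a_mid = field(*curve(0.5 * (ta + tb)))   # A taken at the middle of the small arc
        total += sum(a * (q - p) for a, p, q in zip(a_mid, ra, rb))   # A . delta r
    return total

def field(x, y, z):
    return (-y, x, 0.0)

def circle(t):
    return (math.cos(t), math.sin(t), 0.0)

forward = line_integral(field, circle, 0.0, 2.0 * math.pi)
backward = line_integral(field, circle, 2.0 * math.pi, 0.0)   # reversed orientation
print(forward, backward)
```

Reversing the direction of traversal changes only the sign of the result, as stated above.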

If the line $(L)$ is closed, then the line integral

$$\Gamma = \oint_{(L)} \mathbf{A} \cdot d\mathbf{r} = \oint_{(L)} (A_x\, dx + A_y\, dy + A_z\, dz) \qquad (37)$$

is called the circulation of the vector $\mathbf{A}$ around the line $(L)$. The circulation
possesses the following important property of additivity.
Suppose an oriented open surface $(S)$ is split into a number of parts,
say three, $(S_1)$, $(S_2)$, and $(S_3)$, as in Fig. 151. Denote the contours
of $(S)$ and of these parts by $(L)$, $(L_1)$, $(L_2)$, and $(L_3)$, in accordance
with the orientation of $(S)$, and the corresponding circulations
by $\Gamma$, $\Gamma_1$, $\Gamma_2$, $\Gamma_3$. Then,

$$\Gamma = \Gamma_1 + \Gamma_2 + \Gamma_3$$

Indeed, if all circulations on the right are represented in the form of
a sum of line integrals around the separate arcs shown in Fig. 151,
then the integrals around the arcs interior to $(S)$ will all cancel out
(since each such arc is traversed twice in opposite directions), and the
integrals around the contour arcs of $(S)$ are additive and yield the
circulation in the left member.

This additivity property enables us to say that the circulation
(37) is "generated" on the surface $(S)$ and, hence, to speak of a "density
of generation of the circulation", that is to say, a circulation generated
by an infinitesimal piece of surface and referred to unit area of
this piece. The advisability of considering such a density is also suggested
by the following circumstance. It is easy to verify that the
circulation of a constant vector is always zero:

$$\oint_{(L)} (C_1\, dx + C_2\, dy + C_3\, dz) = (C_1 x + C_2 y + C_3 z)\,\Big|_{(L)}$$

where the last symbol indicates that we have to take the increment
of the result of integration when the point traverses the contour $(L)$.
But after such a traversal the last expression in brackets returns to
its original value and so the increment is zero. But now we can reason
as at the end of Sec. 10.7, namely that by virtue of Taylor's formula
the vector $\mathbf{A}$ inside an infinitesimal contour may be represented as
the sum of a constant vector and of first-order terms. Then an almost
complete compensation occurs: the integral of a constant vector is
zero, while the integral of first-order terms yields a quantity of second
order of smallness. Thus, the circulation around an infinitesimal contour
is proportional not to the length of the contour but to the
area embraced by the contour.

To  compute  the  indicated  density  of  generation  of  the  circula- 
tion, compute  the  circulation  of  the  vector  A around  an  infinitely 
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small  contour.  First  assume  that  this  contour  lies  in  the  plane  ^-con- 
stant. Besides,  since  the  shape  of  the  contour  is  inessential  when  com- 
puting density,  for  this  contour  take  a rectangle  with  sides  parallel 
to  the  coordinate  axes  (see  Fig.  152,  where  the  size  of  the  rectangle 
is  somewhat  enlarged).  By  formula  (37)  the  appropriate  circulation  is 

= ^Ax  dx  + dy  + ^Ax  dx  + ^Ay  dy  (38) 

(1)  (2)  (3)  (4, 

(the  numerals  indicate  the  successive  sides  of  the  rectangle  in  Fig.  152), 
since  only  one  variable  on  each  side  varies,  while  the  other  differen- 
tials are  equal  to  zero.  Taking  into  account  the  sense  of  traversal  of 
the  indicated  sides,  we  get  from  (38) 

$$\begin{aligned}
d\Gamma &= (A_x)_1\,dx + (A_y)_2\,dy - (A_x)_3\,dx - (A_y)_4\,dy \\
        &= [(A_y)_2 - (A_y)_4]\,dy - [(A_x)_3 - (A_x)_1]\,dx
\end{aligned} \tag{39}$$

where  the  numerical  subscript  indicates  the  side  on  which  the  appro- 
priate projection  is  taken.  However,  to  within  higher-order  infinite- 
simals, 

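$$(A_y)_2 - (A_y)_4 = \frac{\partial A_y}{\partial x}\,dx, \qquad (A_x)_3 - (A_x)_1 = \frac{\partial A_x}{\partial y}\,dy$$

and so formula (39) yields

$$d\Gamma = \frac{\partial A_y}{\partial x}\,dx\,dy - \frac{\partial A_x}{\partial y}\,dy\,dx = \left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) dx\,dy$$

Thus, for an infinitely small closed contour,

$$\frac{d\Gamma_{xy}}{dS_{xy}} = \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y} \tag{40}$$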

(the subscripts in the left-hand member indicate that the contour
is parallel to the xy-plane). Also, the contour is traversed in the
positive sense; otherwise the sign has to be reversed or, what is the
same thing, we have to regard dS_xy < 0.

[Fig. 153]
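The limit behind formula (40) can be checked numerically. The sketch below is only an illustration: the field `sample_field` is a hypothetical example chosen here, not one taken from the text, and the circulation around a shrinking rectangle is computed by a simple midpoint rule along the four sides numbered as in (38).

```python
import math

def sample_field(x, y):
    """Hypothetical plane field (A_x, A_y), chosen only for illustration."""
    return (math.sin(y) + x * y, x * x - y * math.cos(x))

def curl_z_exact(x, y):
    """dA_y/dx - dA_x/dy for sample_field, differentiated by hand."""
    return (2 * x + y * math.sin(x)) - (math.cos(y) + x)

def rectangle_circulation(field, x0, y0, dx, dy, n=200):
    """Circulation around the rectangle [x0, x0+dx] x [y0, y0+dy],
    traversed counterclockwise, by the midpoint rule on each side."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        total += field(x0 + t * dx, y0)[0] * dx / n        # side 1: bottom, +x
        total += field(x0 + dx, y0 + t * dy)[1] * dy / n   # side 2: right, +y
        total -= field(x0 + t * dx, y0 + dy)[0] * dx / n   # side 3: top, -x
        total -= field(x0, y0 + t * dy)[1] * dy / n        # side 4: left, -y
    return total

x0, y0 = 0.7, -0.3
for h in (1e-1, 1e-2, 1e-3):
    ratio = rectangle_circulation(sample_field, x0, y0, h, h) / (h * h)
    centre_value = curl_z_exact(x0 + h / 2, y0 + h / 2)
    print(h, ratio, centre_value)
```

As the side h shrinks, the ratio of circulation to area should approach the right-hand side of (40) evaluated inside the rectangle.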

Cartesian  coordinates  in  space  are  completely  equivalent  and  so 
any  formula  containing  these  coordinates  can  produce  any  other 
correct formula if x, y, z are replaced respectively by y, z, x or z, x, y.
(This  — see  Sec.  11.6  — is  called  a cyclic , or  circular , permutation, 
under  which  a right-handed  system  of  coordinates  remains  right- 
handed.)  For  this  reason,  it  follows  from  (40)  that 

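$$\frac{d\Gamma_{yz}}{dS_{yz}} = \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}, \qquad \frac{d\Gamma_{zx}}{dS_{zx}} = \frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x} \tag{41}$$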

Now  let  us  consider  an  infinitesimal  oriented  area  ( dS ) with  an 
arbitrary inclination relative to the coordinate axes. In order to
compute the circulation it is most convenient to take this area in the form
of  a triangle,  as  in  Fig.  153.  On  this  triangle  construct  a tetrahedron 
with  faces  parallel  to  the  coordinate  planes  and  label  the  vertices  of 
the  tetrahedron  with  numbers  as  indicated  in  Fig.  153.  Then  it  is 
easy  to  see  that 

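$$d\Gamma = d\Gamma_{123} = d\Gamma_{124} + d\Gamma_{234} + d\Gamma_{431}$$

since on the right side the integrals along the segments 41, 42,
and 43 cancel out. But the right side can be computed by formulas
(40) and (41):

$$d\Gamma = \left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right) dS_{234} + \left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) dS_{431} + \left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) dS_{124} \tag{42}$$

where the numerical subscripts indicate the areas in question.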
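The result becomes more surveyable if we introduce via the
following formula a vector called the curl (or rotation) of the field A
and denoted by curl A:

$$\operatorname{curl}\mathbf A = \left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right)\mathbf i + \left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right)\mathbf j + \left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\mathbf k \tag{43}$$

If we also note that by virtue of the formula (18) of Ch. 10,

$$dS_{234}\,\mathbf i + dS_{431}\,\mathbf j + dS_{124}\,\mathbf k = d\mathbf S = \mathbf n\,dS$$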

and  if  we  divide  by  dS,  then  (42)  can  be  rewritten  more  simply  thus: 


$$\frac{d\Gamma}{dS} = (\operatorname{curl}\mathbf A)\cdot\mathbf n = \operatorname{curl}_n\mathbf A \tag{44}$$


Here  the  subscript  n indicates  that  we  take  the  projection  of  the  curl 
on  the  normal  n.  This  formula  gives  the  circulation  around  an  infini- 
tesimal contour  referred  to  the  unit  area  embraced  by  this  contour.  * 

Thus,  the  projection  of  the  curl  of  the  field  on  any  direction  n 
is  equal  to  the  ratio  of  the  circulation  of  the  field  around  an  infini- 
tesimal contour  perpendicular  to  n to  the  area  embraced  by  this  con- 
tour. From  this  it  is  evident  for  one  thing  that  the  curl,  whose  defi- 
nition (43)  is  attached  to  a chosen  coordinate  system,  actually  is 
invariantly  connected  with  the  field  (it  forms  a new  vector  field  since 
at  every  point  of  space,  generally,  it  has  its  own  value) : it  does  not 
depend  on  the  choice  of  coordinate  system  since  the  left-hand  side 
of  (44)  does  not  depend  on  this  choice,  and  a knowledge  of  the  pro- 
jection of  the  vector  on  every  direction  determines  this  vector  in 
unique  fashion.  At  the  same  time  the  curl  of  a true  vector  is  a pseudo- 
vector (see  Sec.  11.6),  since  under  a change  of  the  rule  of  the  screw 
for  the  same  orientation  of  the  area  ( dS ),  i.e.  with  the  same  vector  n, 
the  traversal  of  its  contour  is  reversed  and  so  the  circulation  changes 
sign. Note, incidentally, that obtaining a new vector field by
determining the curl of another vector field is a specific feature of
three-dimensional space. Yet obtaining a vector field as a gradient of a scalar
field  in  space  of  any  number  of  dimensions  occurs  in  the  same  way. 
The  relationship  here  is  the  same  as  between  a vector  product  and  a 
scalar  product  (see  the  end  of  Sec.  11.6). 

* From this it follows that the direction of the vector curl A in space is
determined by the direction of the normal to an area for which
$d\Gamma/dS$ is a maximum. This definition is similar to the definition
of the direction of the gradient of a scalar $\varphi$ as the direction
of a line (l), on which direction $d\varphi/dl$ attains a maximum.

Now suppose instead of an infinitesimal surface we have in space
a finite oriented surface (S) with contour (L). We have already seen
that the circulations corresponding to separate parts of the surface
are additive, i.e. the total circulation is

$$\Gamma = \int_{(S)} d\Gamma$$

And  so  from  (37)  and  (44)  it  follows  that 


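$$\oint_{(L)} \mathbf A\cdot d\mathbf r = \int_{(S)} (\operatorname{curl}_n\mathbf A)\,dS = \int_{(S)} \operatorname{curl}\mathbf A\cdot d\mathbf S \tag{45}$$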

that  is,  the  circulation  of  a field  around  a closed  contour  is  equal  to 
the  flux  (see  Sec.  10.7)  of  the  curl  of  this  field  through  the  surface 
bounded  by  the  indicated  contour.  This  important  formula  is  called 
Stokes  theorem. 
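The theorem just stated can be tested on a concrete case. The following sketch is an illustration under stated assumptions: the field A = (-y^3, x^3, 0) and the unit disk in the xy-plane are hypothetical choices made here, and both sides of (45) are approximated by midpoint sums. For this field (curl A)_z = 3x^2 + 3y^2, so both integrals should come out close to 3π/2.

```python
import math

def circulation_unit_circle(n=2000):
    """Line integral of A = (-y^3, x^3, 0) around the unit circle,
    traversed counterclockwise (left side of (45))."""
    total = 0.0
    dt = 2 * math.pi / n
    for i in range(n):
        t = (i + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t) * dt, math.cos(t) * dt
        total += -y ** 3 * dx + x ** 3 * dy
    return total

def curl_flux_unit_disk(n=400):
    """Flux of curl A through the unit disk (right side of (45)).
    Here (curl A)_z = 3 r^2, integrated over rings of area 2*pi*r*dr."""
    total = 0.0
    dr = 1.0 / n
    for i in range(n):
        r = (i + 0.5) * dr
        total += 3 * r ** 2 * 2 * math.pi * r * dr
    return total

print(circulation_unit_circle(), curl_flux_unit_disk(), 1.5 * math.pi)
```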

There  is  another  useful  integral  formula  involving  the  curl  that 
transforms  the  integral 

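$$\mathbf I = \oint_{(S)} \mathbf A\times d\mathbf S$$

over a closed surface (S) (oriented in natural fashion, i.e. with the
outside pointing to infinity) into an integral over the volume (Ω)
bounded by (S). To derive it, note that

$$I_x = \mathbf I\cdot\mathbf i = \oint_{(S)} (\mathbf A\times d\mathbf S)\cdot\mathbf i = \oint_{(S)} (\mathbf i\times\mathbf A)\cdot d\mathbf S$$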

Here  we  make  use  of  the  fact  that  by  virtue  of  the  geometric  meaning 
of  a scalar  triple  product  this  product  does  not  change  under  a cyclic 
permutation  of  the  factors.  However, 


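$$\mathbf i\times\mathbf A = \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k \\ 1 & 0 & 0 \\ A_x & A_y & A_z \end{vmatrix} = -A_z\,\mathbf j + A_y\,\mathbf k$$

so that

$$I_x = \oint_{(S)} (-A_z\,\mathbf j + A_y\,\mathbf k)\cdot d\mathbf S$$

Transforming this integral by the Ostrogradsky formula (Sec. 10.7),
we get

$$I_x = \int_{(\Omega)} \operatorname{div}(-A_z\,\mathbf j + A_y\,\mathbf k)\,d\Omega = \int_{(\Omega)} \left(-\frac{\partial A_z}{\partial y} + \frac{\partial A_y}{\partial z}\right) d\Omega$$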
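In similar fashion we find

$$I_y = \int_{(\Omega)} \left(-\frac{\partial A_x}{\partial z} + \frac{\partial A_z}{\partial x}\right) d\Omega, \qquad I_z = \int_{(\Omega)} \left(-\frac{\partial A_y}{\partial x} + \frac{\partial A_x}{\partial y}\right) d\Omega$$

whence

$$\oint_{(S)} \mathbf A\times d\mathbf S = \mathbf I = I_x\mathbf i + I_y\mathbf j + I_z\mathbf k = \int_{(\Omega)} \left[\left(-\frac{\partial A_z}{\partial y} + \frac{\partial A_y}{\partial z}\right)\mathbf i + \left(-\frac{\partial A_x}{\partial z} + \frac{\partial A_z}{\partial x}\right)\mathbf j + \left(-\frac{\partial A_y}{\partial x} + \frac{\partial A_x}{\partial y}\right)\mathbf k\right] d\Omega = -\int_{(\Omega)} \operatorname{curl}\mathbf A\,d\Omega$$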
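The sign in the last formula is easy to get wrong, so a small numerical check may be useful. This is a sketch under assumptions made here: the region is the unit cube, and the linear field `A` below is a hypothetical example whose curl, computed by hand, is the constant vector (2, 0, -1). The surface integral of A x dS over the six faces should then be close to (-2, 0, 1).

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def A(x, y, z):
    """Hypothetical linear field; by hand, curl A = (2, 0, -1)."""
    return (2 * y - z, 3 * z + x, -x + 5 * y)

def surface_integral_A_cross_dS(field, n=20):
    """Integral of field x dS over the surface of the unit cube,
    with outward normals, by the midpoint rule on each face."""
    total = [0.0, 0.0, 0.0]
    h = 1.0 / n
    for axis in range(3):
        for side, sign in ((0.0, -1.0), (1.0, 1.0)):
            normal = [0.0, 0.0, 0.0]
            normal[axis] = sign           # outward unit normal of this face
            for i in range(n):
                for j in range(n):
                    p = [0.0, 0.0, 0.0]
                    p[axis] = side
                    p[(axis + 1) % 3] = (i + 0.5) * h
                    p[(axis + 2) % 3] = (j + 0.5) * h
                    c = cross(field(*p), normal)
                    for k in range(3):
                        total[k] += c[k] * h * h
    return tuple(total)

minus_curl_times_volume = (-2.0, 0.0, 1.0)   # -curl A times the cube volume 1
print(surface_integral_A_cross_dS(A), minus_curl_times_volume)
```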


Exercises 

1.  Use  the  last  formula  to  obtain  an  invariant  definition  (not  relat- 
ed to  any  choice  of  coordinate  system)  of  the  curl  similar  to 
the  definitions  of  divergence  (formula  (33)  of  Ch.  10)  and  gra- 
dient (formula  (57)  of  Ch.  10). 

2.  Use  the  Stokes  theorem  to  prove  the  Cauchy  theorem  on  the 
integral  of  an  analytic  function  (Sec.  5.8). 

Hint.  Pass  to  real  integrals  and  take  advantage  of  the  Cauchy- 
Riemann  conditions  (17)  of  Ch.  5. 

11.8 The Hamiltonian operator del

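Let us write down the basic differential operations that can be
performed on a scalar field u and a vector field
$\mathbf A = A_x\mathbf i + A_y\mathbf j + A_z\mathbf k$:

$$\operatorname{grad} u = \frac{\partial u}{\partial x}\,\mathbf i + \frac{\partial u}{\partial y}\,\mathbf j + \frac{\partial u}{\partial z}\,\mathbf k,$$

$$\operatorname{div}\mathbf A = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z},$$

$$\operatorname{curl}\mathbf A = \left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right)\mathbf i + \left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right)\mathbf j + \left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\mathbf k$$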

The  Irish  mathematician  William  R.  Hamilton  noticed  that 
these  three  operations  can  be  written  more  simply  if  one  introduces 
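the symbol

$$\nabla = \mathbf i\,\frac{\partial}{\partial x} + \mathbf j\,\frac{\partial}{\partial y} + \mathbf k\,\frac{\partial}{\partial z}$$

called del. Taken by itself, this symbol is an operation sign, that is,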
an  operator.  It  is  a vector  differential  operator  that  preserves  the 
features  of  a vector  and  of  a differentiation  operator.  (The  general 
notion  of  an  operator  is  discussed  in  Sec.  6.2.) 
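“Multiplication” (which is an operation) of the del operator by
a scalar (by a scalar field, to be more exact) u and by a vector A takes
place by the following natural rules:

$$\nabla u = \left(\mathbf i\,\frac{\partial}{\partial x} + \mathbf j\,\frac{\partial}{\partial y} + \mathbf k\,\frac{\partial}{\partial z}\right) u = \mathbf i\,\frac{\partial u}{\partial x} + \mathbf j\,\frac{\partial u}{\partial y} + \mathbf k\,\frac{\partial u}{\partial z} = \operatorname{grad} u,$$

$$\nabla\cdot\mathbf A = \left(\mathbf i\,\frac{\partial}{\partial x} + \mathbf j\,\frac{\partial}{\partial y} + \mathbf k\,\frac{\partial}{\partial z}\right)\cdot(\mathbf i A_x + \mathbf j A_y + \mathbf k A_z) = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} = \operatorname{div}\mathbf A,$$

$$\nabla\times\mathbf A = \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ A_x & A_y & A_z \end{vmatrix} = \mathbf i\left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right) + \text{etc.} = \operatorname{curl}\mathbf A$$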


Del,  being  a differential  operator,  operates  only  on  the  factor 
that  stands  immediately  to  the  right  of  it:  for  example, 

$$(\nabla u)\,v = (\operatorname{grad} u)\,v = v\operatorname{grad} u, \qquad \nabla(uv) = \operatorname{grad}(uv)$$

Therefore  if  there  is  no  factor  following  del,  then  it  is  an  operator: 
for  example, 

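$$\mathbf A\cdot\nabla = (\mathbf i A_x + \mathbf j A_y + \mathbf k A_z)\cdot\left(\mathbf i\,\frac{\partial}{\partial x} + \mathbf j\,\frac{\partial}{\partial y} + \mathbf k\,\frac{\partial}{\partial z}\right) = A_x\frac{\partial}{\partial x} + A_y\frac{\partial}{\partial y} + A_z\frac{\partial}{\partial z}$$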

is  a scalar  differential  operator  which  can  operate  on  a scalar  field 
or  a vector  field.  It  finds  use,  in  particular,  in  the  formula  for  the 
rate  of  change  of  a field  along  a trajectory  (Sec.  4.1) : 


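$$\frac{du}{dt} = v_x\frac{\partial u}{\partial x} + v_y\frac{\partial u}{\partial y} + v_z\frac{\partial u}{\partial z} + \frac{\partial u}{\partial t} = (\mathbf v\cdot\nabla)u + \frac{\partial u}{\partial t} \tag{46}$$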


Repeating  the  derivation  of  this  formula,  we  can  readily  verify  that 
there  is  an  analogous  formula  for  the  rate  of  change  of  a vector  field 
along  a trajectory: 

$$\frac{d\mathbf A}{dt} = (\mathbf v\cdot\nabla)\mathbf A + \frac{\partial\mathbf A}{\partial t}$$

The  separate  terms  here  have  the  same  meaning  as  in  (46).  This 
enables  us,  for  one  thing,  to  rewrite  the  fluid-flow  equation  (54)  of 
Ch.  10  in  the  form 

$$\rho(\mathbf v\cdot\nabla)\mathbf v + \rho\,\frac{\partial\mathbf v}{\partial t} = -\operatorname{grad} p + \mathbf f$$

from  which  it  is  now  quite  easy  to  pass  to  the  coordinate  form. 
When operating with the del operator, one uses the rules of vector
algebra and the rules of differentiation. For instance,

$$\operatorname{curl}(\mathbf A + \mathbf B) = \nabla\times(\mathbf A + \mathbf B) = \nabla\times\mathbf A + \nabla\times\mathbf B = \operatorname{curl}\mathbf A + \operatorname{curl}\mathbf B,$$

$$\operatorname{div}(\lambda\mathbf A) = \nabla\cdot(\lambda\mathbf A) = \lambda(\nabla\cdot\mathbf A) = \lambda\operatorname{div}\mathbf A \quad (\lambda = \text{constant}) \tag{47}$$

since multiplication by a vector and also differentiation have these
properties of linearity. At the same time, we cannot regard $\lambda$ in
(47) as depending on a point in space (that is, as being a scalar field),
for then it would mean that we had taken a variable quantity outside
the sign of differentiation. In order to embrace this case, note
that in the ordinary formula for the derivative of a product,

( uv )'  = u'v  + uv'  (48) 

we  get  the  first  term  if  in  the  process  of  differentiation  we  take  v to 
be  constant,  and  the  second  term  if  in  this  process  we  take  u to  be 
constant,  and  so  the  differentiation  (48)  can  be  carried  out  as  follows: 

$$(uv)' = (u_c v)' + (u v_c)' = u_c v' + u' v_c = uv' + u'v$$

where  the  subscript  c indicates  that  in  differentiating  we  regard  the 
given  quantity  as  a constant  (if  of  course  the  quantity  stands  outside 
the  differentiation  sign,  then  the  subscript  c can  be  dropped).  Thus, 

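$$\operatorname{div}(u\mathbf A) = \nabla\cdot(u\mathbf A) = \nabla\cdot(u_c\mathbf A) + \nabla\cdot(u\mathbf A_c) = u(\nabla\cdot\mathbf A) + (\nabla u)\cdot\mathbf A = u\operatorname{div}\mathbf A + \operatorname{grad} u\cdot\mathbf A$$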

(this  formula  was  derived  in  a different  way  in  Sec.  10.8). 
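The product rule for div (uA) derived above can be spot-checked numerically. The sketch below is an illustration only: the scalar field `u`, the vector field `A` and the evaluation point are hypothetical choices made here, and derivatives are approximated by central differences, so the two sides agree only up to a small discretization error.

```python
import math

def partial(f, point, axis, h=1e-5):
    """Central-difference approximation to the partial derivative of a
    scalar function f(x, y, z) along the given axis at the given point."""
    plus, minus = list(point), list(point)
    plus[axis] += h
    minus[axis] -= h
    return (f(*plus) - f(*minus)) / (2 * h)

def component(field, k):
    """k-th component of a vector field as a scalar function."""
    return lambda x, y, z: field(x, y, z)[k]

def div(field, point):
    return sum(partial(component(field, k), point, k) for k in range(3))

def grad(scalar, point):
    return tuple(partial(scalar, point, k) for k in range(3))

def u(x, y, z):                     # hypothetical scalar field
    return x * y + math.sin(z)

def A(x, y, z):                     # hypothetical vector field
    return (x * z, y * y, math.cos(x) + z)

def uA(x, y, z):
    s = u(x, y, z)
    return tuple(s * a for a in A(x, y, z))

point = (0.4, -1.1, 0.8)
lhs = div(uA, point)
g, a = grad(u, point), A(*point)
rhs = u(*point) * div(A, point) + sum(g[k] * a[k] for k in range(3))
print(lhs, rhs)
```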

After  applying  the  differential  operation  to  a field  we  get  a new 
field,  to  which  these  operations  can  again  be  applied.  To  illustrate, 
consider the “compound” operation curl grad u. We can write it in
the form $\nabla\times(\nabla u)$. But for an “ordinary” vector a and an
“ordinary” scalar u it is always true that

$$\mathbf a\times(\mathbf a u) = 0 \tag{49}$$

(why?). This means that if for a on the left we substitute its expansion
along the Cartesian axes and carry out the computations by the
formal rules of vector algebra, we get zero. But the computation of
the combination $\nabla\times(\nabla u)$ is carried out by the same formal
rules as in (49), only instead of $a_x, a_y, a_z$ we have to take
$\dfrac{\partial}{\partial x}, \dfrac{\partial}{\partial y}, \dfrac{\partial}{\partial z}$.
This means that we again get zero; that is, it is always true that

$$\operatorname{curl}\operatorname{grad} u = 0 \tag{50}$$

Similarly we find that it is always true that

$$\operatorname{div}\operatorname{curl}\mathbf A = 0 \tag{51}$$

(verify this!). This simple property has an important consequence.
Namely,  for  any  field  A we  can,  besides  vector  lines  (Sec.  10.7), 
consider  vortex  lines , that  is  to  say,  vector  lines  of  the  field  curl  A. 
Formula  (51)  states  that  the  vortex  lines  cannot  have  either  sources 
or  sinks,  which  means  they  cannot  originate  or  terminate. 
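Identities (50) and (51) lend themselves to a quick numerical experiment. In the sketch below, which is an illustration and not part of the original argument, the fields `u` and `A` are hypothetical examples and each derivative is replaced by a central difference. Since central differences along different axes commute just as the partial derivatives do, the computed values should vanish up to rounding error.

```python
import math

def partial(f, axis, h=1e-3):
    """Return a function approximating the partial derivative of
    f(x, y, z) along the given axis by a central difference."""
    def df(x, y, z):
        plus, minus = [x, y, z], [x, y, z]
        plus[axis] += h
        minus[axis] -= h
        return (f(*plus) - f(*minus)) / (2 * h)
    return df

def grad(u):
    """Gradient as a tuple of three component functions."""
    return tuple(partial(u, k) for k in range(3))

def curl(F):
    """Curl of a field given as a tuple (Fx, Fy, Fz) of functions, as in (43)."""
    Fx, Fy, Fz = F
    return (lambda x, y, z: partial(Fz, 1)(x, y, z) - partial(Fy, 2)(x, y, z),
            lambda x, y, z: partial(Fx, 2)(x, y, z) - partial(Fz, 0)(x, y, z),
            lambda x, y, z: partial(Fy, 0)(x, y, z) - partial(Fx, 1)(x, y, z))

def div(F):
    Fx, Fy, Fz = F
    return lambda x, y, z: (partial(Fx, 0)(x, y, z)
                            + partial(Fy, 1)(x, y, z)
                            + partial(Fz, 2)(x, y, z))

u = lambda x, y, z: x * x * y + math.sin(y * z)      # hypothetical scalar field
A = (lambda x, y, z: y * z * z,                      # hypothetical vector field
     lambda x, y, z: math.sin(x * z),
     lambda x, y, z: math.exp(x) * y)

point = (0.3, -0.7, 1.2)
curl_grad_u = tuple(c(*point) for c in curl(grad(u)))
div_curl_A = div(curl(A))(*point)
print(curl_grad_u, div_curl_A)
```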

Finally,  the  combination 

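$$\operatorname{div}\operatorname{grad} = \nabla\cdot\nabla = \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$

(the Laplacian operator $\Delta$) was considered in Sec. 10.9.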

Exercises 

1. Derive formulas for curl (uA), div (A × B).

2. On the basis of formulas (33) and (57) of Ch. 10 and the solution
of Exercise 1, Sec. 11.7, write the symbolic expression $\nabla$ in
the form of an integral.

11.9 Potential fields

In Sec. 10.3 we considered the problem of the existence of a
potential (potential energy) in the case of a field of force. We now
consider this question in general form. A vector field A is said to
be a potential field if it is the gradient of some scalar field; if we
denote this scalar field by $-\varphi$, then

$$\mathbf A = -\operatorname{grad}\varphi \tag{52}$$

(cf. formula (9) of Ch. 10). Here the field $\varphi$ is called the
potential of the field A.

Since  the  gradient  of  a constant  scalar  field  is  equal  to  zero,  the 
potential  of  any  field  A (if  the  potential  exists)  is  determined  to 
within  an  arbitrary  additive  constant.  By  apt  choice  of  this  con- 
stant, it  is  possible  to  make  the  value  of  the  potential  equal  to  zero 
at  any  specified  point.  Customarily,  the  value  of  the  potential  at 
infinity  is  taken  to  be  zero.  The  difference  in  the  values  of  the  poten- 
tial at  two  points  no  longer  depends  on  this  slight  arbitrariness  in 
the  choice  of  the  potential,  since  the  constant  additives  in  such  a 
difference  cancel  out. 

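Far from every field is a potential field. Namely, from (52) it
immediately follows that

$$\operatorname{curl}\mathbf A = -\operatorname{curl}\operatorname{grad}\varphi = 0 \tag{53}$$

(see formula (50)), that is, a potential field is necessarily an
irrotational field. Taking into account expression (43) for the curl,
the condition (53) can be rewritten as

$$\frac{\partial A_z}{\partial y} = \frac{\partial A_y}{\partial z}, \qquad \frac{\partial A_x}{\partial z} = \frac{\partial A_z}{\partial x}, \qquad \frac{\partial A_y}{\partial x} = \frac{\partial A_x}{\partial y} \tag{54}$$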
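Conditions (54) can also be derived in a different way. In
coordinate form, (52) is

$$A_x\mathbf i + A_y\mathbf j + A_z\mathbf k = -\frac{\partial\varphi}{\partial x}\,\mathbf i - \frac{\partial\varphi}{\partial y}\,\mathbf j - \frac{\partial\varphi}{\partial z}\,\mathbf k$$

whence it follows that

$$A_x = -\frac{\partial\varphi}{\partial x}, \qquad A_y = -\frac{\partial\varphi}{\partial y}, \qquad A_z = -\frac{\partial\varphi}{\partial z}$$

But then

$$\frac{\partial A_z}{\partial y} = -\frac{\partial^2\varphi}{\partial y\,\partial z}, \qquad \frac{\partial A_y}{\partial z} = -\frac{\partial^2\varphi}{\partial z\,\partial y}$$

and from the equality of mixed partial derivatives (Sec. 4.1), we get
the first relation of (54). The others are derived in similar fashion.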

Conversely, suppose a field specified throughout the space is
irrotational.* Then this field is of necessity noncirculatory, i.e.

$$\oint_{(L)} \mathbf A\cdot d\mathbf r = 0$$

for any closed contour (L). Indeed, span the contour (L) with a surface
(S). Applying the Stokes theorem (45) to (S), we find that if
curl A = 0, then

$$\oint_{(L)} \mathbf A\cdot d\mathbf r = \int_{(S)} \operatorname{curl}\mathbf A\cdot d\mathbf S = 0$$


In  other  words,  for  an  irrotational  field  the  circulation  around  any 
infinitely  small  contour  is  zero,  and  so  the  circulation  around  any 
finite  contour  is  also  equal  to  zero. 

In Sec. 10.3 we showed that a noncirculatory field of necessity
has a potential, which is constructed via the formula (see (8) of Ch. 10)

$$\varphi(M) = \int_{(MM_0)} \mathbf A\cdot d\mathbf r \tag{55}$$

where  M0  is  an  arbitrarily  fixed  point  and  the  choice  of  the  path  of 
integration  is  immaterial,  since  the  line  integral  in  a noncircula- 
tory field  depends  solely  on  the  position  of  the  beginning  and  end 
of  the  path  of  integration.  True,  in  Sec.  10.3  we  spoke  of  a force 
field  and  interpreted  the  potential  as  work,  but  from  the  mathemati- 
cal point  of  view  such  a concrete  interpretation  is  inessential  since 
we  can  speak  about  any  vector  field  and  its  potential. 


* If the field is considered only in a portion of the whole space, there may be
certain complications that will be discussed in Sec. 11.13.



To  summarize,  when  considering  a vector  field  throughout  a 
space,  the  requirements  that  this  field  be  potential,  irrotational  or 
noncirculatory  are  completely  equivalent  so  that  fulfilment  of  one 
of  these  requirements  implies  the  fulfilment  of  all  the  others.  It  again 
follows  from  formula  (55),  by  virtue  of  the  arbitrary  character  of 
the  point  M0,  that  the  potential  is  determined  to  within  an  additive 
constant. 

If  the  vector  field  A is  not  only  irrotational  but  is  devoid  of 
sources  of  vector  lines  (see  Sec.  10.7) ; that  is,  if 

curl  A = 0 and  div  A = 0 

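then from the former it follows that $\mathbf A = -\operatorname{grad}\varphi$
and from the latter, that

$$\operatorname{div}\operatorname{grad}\varphi = -\operatorname{div}\mathbf A = 0, \quad\text{or}\quad \nabla^2\varphi = 0$$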

Thus,  in  this  case  the  potential  of  the  field  must  satisfy  the  Laplace 
equation  (which  was  examined  in  Sec.  10.9)  or,  to  put  it  differently, 
it  must  be  a harmonic  function. 

If  an  irrotational  field  considered  throughout  a space  is  without 
sources  not  only  at  a finite  distance  but  also  at  infinity,  then  the 
vector  of  such  a field  is  simply  identically  zero,  that  is,  there  is  ac- 
tually no  field.  This  is  associated  with  the  following  property  of  har- 
monic functions:  a function  that  is  harmonic  throughout  a space 
and  is  equal  to  zero  at  infinity  is  identically  zero.  A still  more  general 
fact  can  be  proved,  though  we  will  not  do  so:  a function  that  is 
harmonic  throughout  a space  and  is  bounded  at  infinity  is  identically 
equal to a constant, which means that such a potential is also
associated with a zero field. Thus, for the theory to have content,
we must either allow for field sources at infinity, that is, for a
potential that is unbounded at infinity, or allow for sources at a finite
distance, which means not assuming the potential to be harmonic
throughout the whole space. For example, to a field A = constant there
corresponds a potential $\varphi = -\mathbf A\cdot\mathbf r$ (check this!)
that is unbounded at infinity, while to the Newtonian field (Sec. 10.3)
there corresponds a potential that is everywhere harmonic except at one
point.*

* At this point $\Delta\varphi$ is infinite, proportional to the delta function,
so that although the point is "small" compared with an infinite volume, this
point must not be forgotten.

Let us return to the force field F and use the letter A to denote
the work, as in Sec. 10.3. This force can have a potential but it
can also fail to correspond to any potential. This latter fact can be
determined in two ways: either by verifying that curl F ≠ 0, that
is, that at least one of the conditions

$$\frac{\partial F_z}{\partial y} = \frac{\partial F_y}{\partial z}, \qquad \frac{\partial F_x}{\partial z} = \frac{\partial F_z}{\partial x}, \qquad \frac{\partial F_y}{\partial x} = \frac{\partial F_x}{\partial y}$$

(see the conditions (54)) is not fulfilled, or by verifying that the field
is not noncirculatory, which is to say that at least for one closed
contour the work of the force F is different from zero.

Take the following example. Suppose a force F lies in the xy-plane,
i.e., it forms a plane field, and at each point is perpendicular to the
straight line joining this point with the coordinate origin (it is
perpendicular to the radius vector) and is directed counterclockwise. Now
suppose the magnitude of the force is proportional to the distance r
of the point from the origin, that is, |F| = ar, where a is a constant of
proportionality. (Fig. 154 shows such a force for several points of the
plane.) Suppose a body is in motion around a circle of radius R under
the action of such a force (Fig. 155). A very small path from M to N
(this corresponds to a rotation through the angle $d\varphi$) can be taken
approximately to be rectilinear. Its length is $R\,d\varphi$, as the arc
length of a circle corresponding to the central angle $d\varphi$. The force
is directed along the path and so the work performed on the section MN is

$$dA = R\,d\varphi\cdot|\mathbf F| = R\,d\varphi\cdot aR = aR^2\,d\varphi$$
Hence if the point traverses the entire circle, the work done is

$$A = \int_0^{2\pi} aR^2\,d\varphi = aR^2\int_0^{2\pi} d\varphi = \pi R^2\cdot 2a$$

The  work  is  proportional  to  the  area  of  the  circle  and  is  by  no  means 
zero.  (It  can  be  proved  that  for  the  force  at  hand  the  work  done  in 
traversing  a closed  curve  is  proportional  to  the  area  bounded  by  the 
curve  of  motion  for  any  shape  whatsoever  of  that  curve.)  It  is  clear 
that  such  a force  cannot  correspond  to  any  potential. 
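The statement in parentheses, that the work is proportional to the enclosed area for a closed curve of any shape, can be illustrated on a curve other than a circle. The sketch below assumes an ellipse with semi-axes p and q (a hypothetical choice made here) and the force F = -ay i + ax j of this example; the computed work should be close to 2a times the area πpq.

```python
import math

def work_around_ellipse(a, p, q, n=1000):
    """Work of the force F = (-a*y, a*x) around the ellipse
    x = p cos t, y = q sin t, traversed counterclockwise."""
    total = 0.0
    dt = 2 * math.pi / n
    for i in range(n):
        t = (i + 0.5) * dt
        x, y = p * math.cos(t), q * math.sin(t)
        dx, dy = -p * math.sin(t) * dt, q * math.cos(t) * dt
        total += (-a * y) * dx + (a * x) * dy
    return total

a, p, q = 1.5, 2.0, 0.5            # illustrative values
area = math.pi * p * q
print(work_around_ellipse(a, p, q), 2 * a * area)

# For p = q = R the ellipse is the circle treated in the text.
R = 0.9
print(work_around_ellipse(a, R, R), math.pi * R ** 2 * 2 * a)
```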

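The last conclusion can also be reached by formally computing
the curl. Since the field of force in this case is the same
as the field of linear velocities in the case of a rotation about the z-axis
with angular velocity a, we can take advantage of formula (5) to get

$$\mathbf F = (a\mathbf k)\times\mathbf r = \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k \\ 0 & 0 & a \\ x & y & 0 \end{vmatrix} = -ay\,\mathbf i + ax\,\mathbf j \tag{56}$$

(this can easily be obtained directly from Fig. 154). From this we
have

$$\operatorname{curl}\mathbf F = \nabla\times\mathbf F = \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ -ay & ax & 0 \end{vmatrix} = a\mathbf k + a\mathbf k = 2a\mathbf k \ne 0$$

Thus, the curl is not equal to zero and so the field is not a potential
field.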

Exercise 

Prove  that  the  field 
$$\mathbf A = 2xz\,\mathbf i + y^2\,\mathbf j + x^2\,\mathbf k$$

is  a potential  field  and  construct  its  potential. 

11.10  The  curl  of  a velocity  field 

The  concept  of  a curl  is  especially  pictorial  when  considering  a 
field  of  linear  velocities  v of  particles  of  a continuous  medium.  Let  us 
examine  some  examples. 

Let  the  medium  be  in  translational  motion  like  a rigid  body. 
Then  v = constant  and  since  the  curl  is  expressed  with  the  aid  of 
operations  of  differentiation,  it  follows  that  curl  v = 0.  Here,  there 
is  no  circulation  of  velocity. 

Suppose the medium is rotating about an axis with angular
velocity $\boldsymbol\omega$ like a rigid body. We take advantage of the
fact that the curl is invariantly connected with the field and, hence,
when calculating the curl we can arrange the coordinate axes in any way
we wish for the sake of convenience. And so we direct the z-axis along
the axis of rotation. Then computations similar to those carried out at
the end of Sec. 11.9 yield

$$\mathbf v = -\omega y\,\mathbf i + \omega x\,\mathbf j, \qquad \operatorname{curl}\mathbf v = 2\omega\mathbf k = 2\boldsymbol\omega$$

Hence,  the  curl  here  is  constant  throughout  the  space  and  is  equal  to 
twice  the  vector  of  angular  velocity.  For  this  reason,  the  circulation 
of  linear  velocity  around  a small  contour  is  a maximum  if  this  contour 
is  perpendicular  to  the  axis  of  rotation  (then  the  curl  is  projected  on 
the  normal  to  the  corresponding  small  area  in  full  length)  and  if  the 
plane  of  the  contour  is  parallel  to  the  axis  of  rotation,  then  the  circu- 
lation is  zero. 

It  can  be  demonstrated  that  any  motion  of  a rigid  body  at  every 
instant  of  time  is  the  result  of  a superposition  of  translational  and 
rotational  motions.  By  virtue  of  the  preceding  two  paragraphs,  at 
every  instant  the  curl  of  the  linear  velocity  of  a rigid  body  is  the  same 
at  all  points  of  the  body  and  is  equal  to  twice  the  instantaneous  vector 
of  the  angular  velocity. 

Now  let  us  take  up  the  motion  of  a medium  in  which  the  distances 
between  its  points  vary.  Suppose  we  have  a “pure”  compression  of 
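gas by a piston along the x-axis to the plane x = 0; then the velocity
field is of the form

$$\mathbf v = -\lambda x\,\mathbf i$$

where $\lambda$ is the proportionality constant. Calculating the curl, we get

$$\operatorname{curl}\mathbf v = \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ -\lambda x & 0 & 0 \end{vmatrix} = 0$$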

Cauchy  proved  that  any  motion  of  a small  volume  of  a continuous 
medium  under  strain  (of  a fluid  or  a solid  under  strain)  is  at  any  time 
the  result  of  a superposition  of  translational  and  rotational  motions 
and  also  of  “pure”  compressions  and  tensions.  (Note  that  small  por- 
tions of  an  incompressible  liquid  may  experience  simultaneously 
“pure”  compressions  and  tensions  in  different  directions.)  Since  a 
nonzero  curl  is  obtained  only  for  rotational  motion,  we  see  that  for 
an  arbitrary  motion  of  a medium  the  curl  of  the  field  of  linear  velocity  v 
of  particles  is  equal  at  each  point  to  twice  the  angular  velocity  vector 
of  the  corresponding  particle.  True,  in  the  general  case,  the  curl  at 
different  points  is  different.  Thus,  in  the  case  of  fluid  flow  the  fact 
that the curl of the field of linear velocity is different from zero
indicates the presence of vorticity, a rotation, whence the term rotation
(curl).

Now  let  us  examine  the  “shear”  flow  represented  by  arrows 
in  Fig.  156.  This  is  what  happens  in  the  flow  of  a viscous  fluid 
along a solid wall. Here, $\mathbf v = ay\,\mathbf i$, where a is the
proportionality factor, whence curl v = −ak (check this!). This is a
vortex-type flow and every particle of the fluid rotates clockwise with
angular velocity a/2.
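The three velocity fields discussed in this section (rigid rotation, "pure" compression, shear flow) can be compared side by side with a small numerical sketch. It is an illustration only: the numerical values of the angular velocity, of λ and of a are arbitrary choices made here, and the curl is approximated by central differences, which are exact up to rounding for these linear fields.

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def curl_at(field, point, h=1e-4):
    """Central-difference approximation of curl field at a point."""
    def d(comp, axis):
        plus, minus = list(point), list(point)
        plus[axis] += h
        minus[axis] -= h
        return (field(*plus)[comp] - field(*minus)[comp]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def rigid_rotation(omega):
    """v = omega x r for an arbitrary angular velocity vector omega."""
    return lambda x, y, z: cross(omega, (x, y, z))

def compression(lam):
    """'Pure' compression along the x-axis: v = -lam * x * i."""
    return lambda x, y, z: (-lam * x, 0.0, 0.0)

def shear(a):
    """Shear flow along a wall: v = a * y * i."""
    return lambda x, y, z: (a * y, 0.0, 0.0)

point = (0.3, -1.2, 0.5)
omega = (0.2, -0.4, 1.1)           # illustrative angular velocity
print(curl_at(rigid_rotation(omega), point))   # close to 2 * omega
print(curl_at(compression(0.7), point))        # close to (0, 0, 0)
print(curl_at(shear(3.0), point))              # close to (0, 0, -3)
```

Note that the rigid rotation here is taken about an arbitrarily directed axis, so the check also illustrates that curl v = 2ω does not depend on aligning the z-axis with the axis of rotation.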

Finally,  let  us  examine  a simple  “vortex  line”,  i.e.  a plane- 
parallel  field  defined  in  the  xy- plane  by  the  equation 

$$\mathbf v = (p\mathbf k)\times\frac{\mathbf r^0}{r} = (p\mathbf k)\times\frac{\mathbf r}{r^2}$$

This  field,  which  is  shown  in  Fig.  157,  is  somewhat  reminiscent  of  the 
field  in  Fig.  154,  but  here  the  modulus  of  the  field  vector  is  inversely 
(not  directly)  proportional  to  the  distance  from  the  origin.  Here  the 
z-axis  is  the  axis  of  the  vortex  line. 
As in (56), let us find the expression of velocity in coordinate
form:

$$\mathbf v = -\frac{py}{x^2 + y^2}\,\mathbf i + \frac{px}{x^2 + y^2}\,\mathbf j \tag{57}$$

whence it is easy to calculate directly that curl v = 0 off the z-axis
(check this!). Thus, each particle of the fluid travels around the vortex
line, undergoing deformation without performing rotational motion
about its own axis. At the same time, since the circulation of the vector v
around any circle with centre at the origin is

$$\frac{p}{r}\cdot 2\pi r = 2\pi p$$

it follows that at the origin there is a “source of circulation” whose
density in the xy-plane is equal to

$$\frac{d\Gamma}{dS} = 2\pi p\,\delta(\mathbf r)$$

(where $\delta$ is the delta function,* see Sec. 6.3). Since the curl of a
plane-parallel field is perpendicular to the plane of the field, in this
example we get

$$\operatorname{curl}\mathbf v = 2\pi p\,\delta(x\mathbf i + y\mathbf j)\,\mathbf k = 2\pi p\,\delta(x)\,\delta(y)\,\mathbf k \tag{58}$$

(regarding  the  last  representation,  see  Sec.  6.3). 
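The concentration of all the circulation at the origin can be seen in a simple numerical experiment. The sketch below is an illustration with arbitrarily chosen values of p, radii and centres: it integrates the field (57) around circles, and the result should be close to 2πp for any circle centred at the origin and close to zero for a circle that does not enclose the origin.

```python
import math

def vortex_velocity(x, y, p):
    """Plane field (57): v = (-p*y, p*x) / (x^2 + y^2)."""
    r2 = x * x + y * y
    return (-p * y / r2, p * x / r2)

def circulation(p, cx, cy, radius, n=4000):
    """Circulation of the vortex field around the circle with centre
    (cx, cy) and the given radius, traversed counterclockwise."""
    total = 0.0
    dt = 2 * math.pi / n
    for i in range(n):
        t = (i + 0.5) * dt
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        vx, vy = vortex_velocity(x, y, p)
        total += vx * (-radius * math.sin(t) * dt) + vy * (radius * math.cos(t) * dt)
    return total

p = 0.8                                    # illustrative value
print(circulation(p, 0.0, 0.0, 0.5), 2 * math.pi * p)
print(circulation(p, 0.0, 0.0, 3.0), 2 * math.pi * p)
print(circulation(p, 3.0, 0.0, 1.0))       # origin not enclosed: close to 0
```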

Exercise 


Find  the  directions  in  a particle  of  a vortex  line  that  remain 
invariant  under  an  infinitesimal  displacement. 

Hint.  Proceed  from  the  fact  that  the  vector  dr  must  not  turn. 


11.11  Magnetic  field  and  electric  current 

The  concepts  of  vector  product  and  curl  are  extensively  employed 
in  discussions  of  magnetic  fields.  For  the  sake  of  simplicity  we  consider 
the  fields  in  a vacuum. 

A magnetic  field  is  completely  characterized  at  every  point  of 
space  by  the  “magnetic-field  intensity”  vector  H.  This  vector  is  lar- 
gely analogous  to  the  vector  E of  electric-field  intensity  (Secs.  10.5 
and  10.9)  but  does  not  act  on  fixed  charges,  like  E,  but  on  permanent 
magnets  or  (what  will  soon  be  seen  to  be  the  same  thing)  on  moving 
charges.  A magnetic  field  can  also  be  established  by  permanent  mag- 
nets or  moving  charges. 


* Here we have a two-dimensional delta function: $\delta(\mathbf r) = \delta(x)\,\delta(y)$,
since the integration is over a plane, $\Gamma = \int \operatorname{curl}\mathbf A\cdot d\mathbf S$.
Let us first consider the simplest scheme. Suppose a magnetic
field is established by a current J flowing along an infinite (practically
speaking, a very long) rectilinear conductor which we assume to be
coincident with the z-axis. Clearly, we then have a plane-parallel field,
which it suffices to consider solely in the xy-plane. Experiment tells
us that the vector H is then obtained as indicated in Fig. 158; it lies
in the xy-plane and is perpendicular to the radius vector r, the
direction of H being determined by the rule of the right-handed screw.
The intensity is directly proportional to the current and inversely
proportional to the distance of the point from the conductor, i.e.
$H = a\,\dfrac{J}{r}$, where a is a proportionality factor; for a certain
choice of units of H and J, it turns out equal to 2/c, where c is the
velocity of light, so that $H = \dfrac{2J}{cr}$.

The field H under consideration has precisely the form shown
in Fig. 157, which is to say that it is given by formula (57), where for p
we have to substitute $\dfrac{2J}{c}$. From this, by formula (58), we get

$$\operatorname{curl}\mathbf H = \frac{4\pi J}{c}\,\delta(x)\,\delta(y)\,\mathbf k$$

On the other hand, the product $J\,\delta(x)\,\delta(y)\,\mathbf k$ is the
current density shown in Fig. 158. We will denote the current density by j
(do not confuse this with the unit vector of the y-axis!). In this example
we then have

$$\operatorname{curl}\mathbf H = \frac{4\pi}{c}\,\mathbf j \tag{59}$$

By  this  formula,  j is  expressed  in  terms  of  H.  We  can  obtain  the 
inverse,  in  which  H is  expressed  in  terms  of  j.  To  do  this  we  introduce 
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tlu:  “vector  potential”  A of  the  field  H by  analogy  with  the  scalar 
potential  of  an  electric  field  (see  Sec.  10.5), 


(60) 


I n the  case  at  hand  this  potential  is  equal  to 


Aw  = i( 


J k< 

]/  x1  + y2  + (C  — z)2 


l t « 

C J yx2  + y2  + £-Z)* 

— OO 


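Although the last integral diverges, we can differentiate it with respect
to parameters (cf. Sec. 10.6), whence, in particular,

$$\operatorname{curl}\mathbf{A} = \begin{vmatrix}\mathbf{i} & \mathbf{j} & \mathbf{k}\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] 0 & 0 & \dfrac{J}{c}\displaystyle\int_{-\infty}^{\infty}\dfrac{d\zeta}{\sqrt{x^2+y^2+(\zeta-z)^2}}\end{vmatrix}$$

$$= \frac{J}{c}\left[\mathbf{i}\,\frac{\partial}{\partial y}\int_{-\infty}^{\infty}\frac{d\zeta}{\sqrt{x^2+y^2+(\zeta-z)^2}} - \mathbf{j}\,\frac{\partial}{\partial x}\int_{-\infty}^{\infty}\frac{d\zeta}{\sqrt{x^2+y^2+(\zeta-z)^2}}\right]$$

$$= -\frac{J}{c}\int_{-\infty}^{\infty}\left[x^2+y^2+(\zeta-z)^2\right]^{-3/2}d\zeta\cdot(y\mathbf{i}-x\mathbf{j})$$

This integral can easily be evaluated by the substitution
$\zeta - z = \sqrt{x^2+y^2}\,\tan s$ and we get (check this!)

$$\operatorname{curl}\mathbf{A} = -\frac{J}{c}\cdot\frac{2}{x^2+y^2}\,(y\mathbf{i}-x\mathbf{j})$$

We have again arrived at the formula (57) with $p = 2J/c$, i.e. at H:

$$\mathbf{H} = \operatorname{curl}\mathbf{A} = \frac{1}{c}\operatorname{curl}\int_{(\Omega_0)}\frac{\mathbf{j}(\mathbf{r}_0)}{|\mathbf{r}-\mathbf{r}_0|}\,d\Omega_0 \qquad (61)$$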
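The evaluation of the integral by the tangent substitution can be sketched as follows. The shorthand $\rho^2 = x^2 + y^2$ is introduced here only for brevity and is not the text's notation.

```latex
% Shorthand assumed for this sketch: \rho^2 = x^2 + y^2.
\[
\zeta - z = \rho\tan s,\qquad
d\zeta = \frac{\rho\,ds}{\cos^2 s},\qquad
\rho^2 + (\zeta - z)^2 = \frac{\rho^2}{\cos^2 s},
\]
\[
\int_{-\infty}^{\infty}\frac{d\zeta}{\left[\rho^2+(\zeta-z)^2\right]^{3/2}}
= \int_{-\pi/2}^{\pi/2}\frac{\cos^3 s}{\rho^3}\cdot\frac{\rho}{\cos^2 s}\,ds
= \frac{1}{\rho^2}\int_{-\pi/2}^{\pi/2}\cos s\,ds
= \frac{2}{\rho^2}.
\]
```

Multiplying by $-(J/c)(y\mathbf{i}-x\mathbf{j})$ gives the field of the form (57) with $p = 2J/c$.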

When considering systems of infinite rectilinear conductors,
their magnetic fields H and also the current densities are additive,
which is to say that (59) and (61) hold true all the same. We will also
assume that they are valid for any stationary, not necessarily
rectilinear, distribution of currents in space. (This is rather clear for
formula (59) since it connects the values of H and j in a single point,
and formula (61) can be derived from (59), but we will not go into that
here.) This assumption is justified by the fact that the consequences
derived from it are in agreement with one another and find
experimental corroboration.

An important consequence follows from (59). To obtain it, we
apply the Stokes theorem to a computation of the circulation of the
vector H around any closed contour (L) bounding the surface (σ):

$$\oint_{(L)}\mathbf{H}\cdot d\mathbf{r} = \int_{(\sigma)}(\operatorname{curl}\mathbf{H})_n\,d\sigma = \frac{4\pi}{c}\int_{(\sigma)}j_n\,d\sigma \qquad (62)$$

The integral on the right has a simple physical meaning (cf. Sec. 10.4):
this is the quantity of electricity passing outwards through (σ) in
unit time, i.e. the total current J through (σ). Thus, the circulation
of the vector H around any closed contour is proportional to the
total current through the surface bounded by that contour.
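The circulation law can be checked numerically for the straight wire. The sketch below is only an illustration: the function names, the square contours and the values of J and c are assumptions chosen here, and the field is taken in the form H = 2J/(cr) directed along the azimuth.

```python
import math

def wire_field(J, c):
    """H of a straight wire along the z-axis: magnitude 2J/(c r), azimuthal."""
    def H(x, y):
        r2 = x * x + y * y
        return (-2.0 * J / c * y / r2, 2.0 * J / c * x / r2)
    return H

def circulation(vertices, field, n_per_side=5000):
    """Midpoint-rule line integral of field . dr around a closed polygon."""
    total = 0.0
    m = len(vertices)
    for k in range(m):
        xa, ya = vertices[k]
        xb, yb = vertices[(k + 1) % m]
        dx = (xb - xa) / n_per_side
        dy = (yb - ya) / n_per_side
        for i in range(n_per_side):
            fx, fy = field(xa + (i + 0.5) * dx, ya + (i + 0.5) * dy)
            total += fx * dx + fy * dy
    return total

def square(cx, cy, half):
    """Counterclockwise square contour with centre (cx, cy)."""
    return [(cx - half, cy - half), (cx + half, cy - half),
            (cx + half, cy + half), (cx - half, cy + half)]

J, c = 3.0, 2.0  # arbitrary illustrative values, not physical constants
H = wire_field(J, c)
around_wire = circulation(square(0.3, -0.2, 1.0), H)    # contour encloses the wire
away_from_wire = circulation(square(3.0, 0.0, 1.0), H)  # contour does not
print(around_wire, 4 * math.pi * J / c)
print(away_from_wire)
```

The first contour should give a value close to 4πJ/c whatever its shape or position, as long as it encircles the wire once; the second should give a value close to zero.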

Another important consequence is obtained if we take the
divergence of both sides of (59). We obtain

$$\operatorname{div}\mathbf{j} = \frac{c}{4\pi}\operatorname{div}\operatorname{curl}\mathbf{H} = 0$$

(see (51)). This "law of conservation of electricity" in the stationary
case could have been derived independently of formula (59): it means
that in unit time just as much electricity flows into any volume as
flows out of it, i.e. the total flux of the vector j through any closed
surface is equal to zero (cf. similar reasoning in Sec. 10.8).

If  we  take  the  divergence  of  both  sides  of  (61),  we  see  that 

div  H = div  curl  A = 0 (63) 

Thus  (see  Sec.  10.7),  unlike  an  electric  field,  a magnetic  field  does  not 
have  sources  of  vector  lines,  which  means  that  magnetic  lines  of  force 
cannot  originate  or  terminate  anywhere.  (Since  we  relied  solely  on  (51), 
it  follows  that  in  the  general  case,  too,  a vector  field  can  be  the  curl 
of  some  “vector  potential”  only  if  the  divergence  of  this  field  is  zero, 
which  means  it  does  not  have  sources  of  vector  lines.  Such  fields  are 
customarily  called  solenoidal  fields .) 

To illustrate, let us compute the magnetic field of an infinite
cylindrical coil having n turns in the winding per unit length and
carrying a current J. We will disregard the thickness of the winding
(later on we will see that it is inessential) and for the sake of simplicity
will regard the winding not as a helical winding but as a circular one.
From formula (60) we see that in the given case the vector potential A
is parallel to the plane of the turns and, by the meaning of the
problem, forms a plane-parallel field. But then from formula (61) and
from the definition (43) of curl it immediately follows that the field H
is everywhere parallel to the axis of the coil and does not vary
along the coil, which is to say it is also plane-parallel. Now apply
formula (62) to the rectangular contour ABCD shown in Fig. 159.
Since the integrals along the line segments BC and DA are zero and
since no current flows through the surface bounded by this contour,
we get

$$H_1h - H_2h = 0$$

(h is the altitude of the rectangle), whence $H_1 = H_2$. Thus, a magnetic
field inside a solenoid (coil) is constant not only in height but also in
cross section. Similarly, we find that the field outside the coil is also
constant, and since at infinity it is zero, it is zero everywhere outside
the coil. Finally, applying (62) to the rectangular contour EKGF,
we get $Hh = \dfrac{4\pi}{c}\,Jnh$, whence we find the magnetic-field intensity
inside the coil to be

$$H = \frac{4\pi}{c}\,Jn$$

We see that it does not depend on the radius of the coil but only
on the number of ampere turns per unit length. For one thing, from
this there follows the earlier-mentioned independence of the field of
the winding thickness. We leave it to the reader to obtain an expression
for the potential $\mathbf{A} = \dfrac{2\pi Jn}{c}\,r\,\mathbf{e}_\varphi$ $(r < r_0)$, $\mathbf{A} = \dfrac{2\pi Jn\,r_0^2}{cr}\,\mathbf{e}_\varphi$ $(r > r_0)$,
where $\mathbf{r} = x\mathbf{i} + y\mathbf{j}$, $r = |\mathbf{r}|$, $\mathbf{e}_\varphi = \mathbf{k}\times\mathbf{r}^0$, and $r_0$ is the radius of the
coil (do not confuse $r_0$ with $\mathbf{r}^0 = \mathbf{r}/r$).

The problem of constructing a magnetic field from a given system
of currents can be approached differently. We assume that the
magnetic field at any point M is created by each current element
$d\mathbf{J} = \mathbf{j}\,d\Omega$ in accord with the Biot-Savart law

$$d\mathbf{H} = \frac{d\mathbf{J}\times\overrightarrow{M_0M}}{c\,|M_0M|^3}$$

where $M_0$ is the point at which the current element is located. The
total vector H at M is found by means of integrating over all elements
of current. The formula for dH is written in precisely this way
because all of its consequences are in agreement with experiment.
This approach is equivalent to our earlier approach: from the
Biot-Savart law we can derive formulas (59) and (61), and conversely.

Exercise 

Obtain from the Biot-Savart law the intensity of the magnetic
field: (a) of a straight infinite conductor; (b) on the axis of a
solenoid from one turn.

11.12  Electromagnetic  field  and  Maxwell’s  equations 

Actually,  in  the  general  case,  there  exist  in  one  and  the  same 
region  of  space  both  an  electric  and  a magnetic  field;  the  result  of 
this  is  an  electromagnetic  field , which  at  every  point  in  space  (at  any 
rate,  in  a vacuum,  and  that  is  what  we  assume  here)  is  characterized 
by  two  vectors : the  electric  vector  E and  the  magnetic  vector  H. 
The  differential  equations  connecting  these  vectors  are  called  Maxwell's 
equations  and  play  a very  important  role  in  physics. 

Actually, the appropriate equations have already been written
out for a stationary electromagnetic field. These are first of all the
equations

$$\operatorname{curl}\mathbf{E} = 0,\qquad \operatorname{div}\mathbf{E} = 4\pi\rho$$

of which the former follows from the potentiality of a steady-state
electric field (Secs. 10.5 and 11.9) and the latter is the equation (42)
of Ch. 10. In addition, there are the equations

$$\operatorname{curl}\mathbf{H} = \frac{4\pi}{c}\,\mathbf{j},\qquad \operatorname{div}\mathbf{H} = 0$$

(see (59) and (63)).

For  the  stationary  case  the  interrelationship  between  the  elec- 
tric and  the  magnetic  field  is  not  apparent.  The  situation  is  quite 
different  in  the  nonstationary  case  that  we  now  take  up.  It  turns 
out  that  every  change  in  the  electric  field  affects  the  magnetic  field 
and  every  change  in  the  magnetic  field  acts  on  the  electric  field,  so 
that  there  is  no  way  of  considering  these  fields  separately. 

When considering the former action it is exceedingly useful to
apply the concept of displacement currents. To derive the appropriate
formulas let us first consider a capacitor (Fig. 160), the plates of which
are charged with surface density +ν for the left one and −ν for
the right one. Each of these plates generates in the space between
them an electric field, which can be calculated by the formulas (29)
of Ch. 10 so that the total field is

$$\mathbf{E} = 4\pi\nu\,\mathbf{i} \qquad (64)$$

If ν increases, then the left-hand plate is approached from without
by positive charges that settle on it at the rate of $J = \dfrac{d(S\nu)}{dt}$,
where S is the area of one plate; negative charges settle at the same
rate on the right-hand plate. If for a moment we imagine that the
points A and B are connected by a conductor on which charges do
not settle, then by the law of conservation of electricity there should
pass through any cross section between A and B one and the same
current J. Actually, the charges do not pass between the plates of
the capacitor, but the law of constancy of currents can be preserved
if we introduce the idea of a displacement current J that flows, as it
were, from A to B. By formula (64) the density of this displacement
current is

$$\mathbf{j}_{\text{dis}} = \frac{J}{S}\,\mathbf{i} = \frac{d\nu}{dt}\,\mathbf{i} = \frac{1}{4\pi}\frac{\partial\mathbf{E}}{\partial t}$$


It turns out that in the formulas relating the magnetic vector H
and the current j, the change in electric field has to be replaced
by the displacement current whose density is computed from the
same formula, $\mathbf{j}_{\text{dis}} = \dfrac{1}{4\pi}\dfrac{\partial\mathbf{E}}{\partial t}$. And so for a nonstationary electric
field, instead of (59) we have to write the equation

$$\operatorname{curl}\mathbf{H} = \frac{4\pi}{c}\,(\mathbf{j}+\mathbf{j}_{\text{dis}}) = \frac{4\pi}{c}\,\mathbf{j} + \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} \qquad (65)$$

Similarly, also in the formula (61), to construct H we have to
substitute $\mathbf{j}+\mathbf{j}_{\text{dis}}$ for j. But then, arguing as in Sec. 11.11, we can derive
the relation

$$\operatorname{div}\mathbf{H} = 0 \qquad (66)$$

(see (63)).

Now let us turn to a variable magnetic field. Experiments
(Faraday) show that any change in the magnetic field induces an
electric field, which, unlike the field generated by charges, is a
rotational field. We now show that the corresponding law of
induction is of the form

$$\operatorname{curl}\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t} \qquad (67)$$


By  virtue  of  Sec.  11.9  the  electric  field  here  does  not  have  a poten- 
tial, i.e.  we  cannot  speak  of  a difference  of  potentials  in  the  field, 
and  the  work  done  by  the  field  depends  not  only  on  the  beginning 
and  end  of  the  path,  but  on  the  entire  trajectory. 

In order to derive formula (67), imagine placed in a field a closed
conductor (L) bounding a surface (σ). According to the results of
Faraday's experiments, the change in magnetic flux $\int_{(\sigma)}\mathbf{H}\cdot d\boldsymbol{\sigma}$ gives
rise to an emf in the circuit (L) that is proportional to the rate of
change of the flux. However, the indicated emf is equal to the
sum of the elementary emf in small portions of the circuit, and these
small emf are equal, as it is easy to see, to $\mathbf{E}\cdot d\mathbf{r}$. Thus

$$\oint_{(L)}\mathbf{E}\cdot d\mathbf{r} = -k\,\frac{d}{dt}\int_{(\sigma)}\mathbf{H}\cdot d\boldsymbol{\sigma} = -k\int_{(\sigma)}\frac{\partial\mathbf{H}}{\partial t}\cdot d\boldsymbol{\sigma}$$

Here the minus sign is taken by the Lenz law, according to which
the induced emf is counter to the change in flux; k is a
proportionality factor, which can be shown to be equal to 1/c; the
differentiation with respect to t has been brought under the integral
sign as differentiation with respect to a parameter (Sec. 3.6).
Transforming the left-hand side by the Stokes theorem, we get

$$\int_{(\sigma)}\operatorname{curl}\mathbf{E}\cdot d\boldsymbol{\sigma} = -\frac{1}{c}\int_{(\sigma)}\frac{\partial\mathbf{H}}{\partial t}\cdot d\boldsymbol{\sigma}$$

whence follows (67) because (σ) is arbitrary.
The case of a force that does not correspond to a potential is
realized  in  a transformer.  The  current  flowing  in  the  primary  winding 
of  a transformer  sets  up  a magnetic  field.  If  the  current  in  the  primary 
winding  varies  with  time,  a variable  magnetic  field  is  set  up  which 
gives  rise  to  an  electric  field  of  precisely  that  type  (a  so-called  rota- 
tional field),  as  shown  in  Fig.  154.  This  means  that  the  force  acting 
on  a charged  body,  say  an  electron,  located  inside  the  secondary 
winding  does  not  correspond  to  any  potential  at  all. 

An  electron  in  motion  round  a circle  can  build  up  more  and  more 
energy.  This  of  course  does  not  mean  there  is  a violation  of  the 
law  of  conservation  of  energy,  since  the  accelerated  motion  of  the 
electron  will  in  turn  affect  the  primary  winding,  giving  rise  to  an 
additional  consumption  of  energy  in  the  current  sources  supplying 
it.  But  for  a single  electron  the  law  that  “the  sum  of  the  kinetic 
and  potential  energies  of  an  electron  is  a constant”  does  not  hold 
true  because  the  force  acting  on  the  electron  does  not  correspond 
to  any  potential  energy.  This  principle  of  accelerating  electrons  is 
used  in  an  electron  accelerator  called  a betatron.  Here,  the  electric 
field  accelerates  the  electrons  and  the  magnetic  field  appropriately 
bends  their  paths,  making  them  move  in  circles. 

Since  a rotational  electric  field  appears  only  during  a change  in 
the  magnetic  field,  such  an  electric  field  will  not  be  constant  for  a 
long  time,  for  when  the  magnetic  field  reaches  a maximum,  the  rate 
of  its  change  becomes  zero,  after  which  the  electric  field  vanishes 
as  well. 

Because  of  the  short  duration  of  the  electric  field,  heavy  particles 
(protons,  say)  do  not  have  time  to  build  up  sufficient  energy  in  a 
betatron.  Electrons  are  light  and  so  they  build  up  energy  very  well. 
There are betatrons in which electrons acquire an energy of tens
of millions of electron volts, which is the energy they would acquire
by passing through a potential difference of tens of millions of volts.
Fast  electrons  emitted  in  radioactive  transformations  were  called 
beta  rays  in  the  early  days  of  radioactive  research  when  the  physical 
nature  of  these  rays  had  not  yet  been  elucidated.  Whence  also  the 
name  “betatron”  for  the  instrument  in  which  fast  electrons  are  obtained 
by  accelerating  slow  electrons  without  the  phenomenon  of  radio- 
activity occurring. 

The equation

$$\operatorname{div}\mathbf{E} = 4\pi\rho \qquad (68)$$

which connects the electric field and charges (see formula (42) of
Ch. 10), holds true in the nonstationary case as well. All four
equations, (65) to (68), form the system of Maxwell's equations. To them
we can add the continuity equation

$$\frac{\partial\rho}{\partial t} + \operatorname{div}\mathbf{j} = 0 \qquad (69)$$

which is derived in exactly the same way as the similar equation (41),
Ch. 10, in hydrodynamics, and also one or another relation connecting j
and E (the generalized Ohm law, which in the simplest cases is of
the form $\mathbf{j} = \lambda\mathbf{E}$, where λ is the coefficient of electric conductivity)
and including in the general case the action of external forces.

Exercise 

Derive  equation  (69)  from  equations  (65)  and  (68). 

11.13  Potential  in  a multiply  connected  region 

Consider  the  force  field  F.  From  both  conditions  for  the  existence 
of  a potential  — the  condition  curl  F = 0 (Sec.  11.9)  and  the  condi- 
tion that  the  work  done  in  traversing  a closed  curve  is  zero  (Sec.  10.3) 
— it  is  evident  that  the  possibility  of  disrupting  these  conditions, 
that  is,  the  possibility  of  the  existence  of  forces  of  a nonpotential 
type,  involves  considering  a function  of  two  or  three  variables 
instead  of  one.  Indeed,  in  the  case  of  one  variable  (motion  in  a straight 
line)  it  is  possible  to  return  to  the  original  point  only  by  covering  each 
section  of  the  path  twice : once  in  one  direction  and  then  in  the  return 
direction.  And  so  (if  the  force  does  not  depend  on  the  time  or  on  the 
velocity  but  only  on  the  position  of  the  body)  in  the  case  of  motion 
in  a straight  line,  the  work  of  the  force  is  zero  if  the  path  ends  at  the 
point  from  which  it  started. 

In  arbitrary  motion  in  a plane  or  in  space,  it  is  possible  to  start 
out  from  the  initial  point  along  a certain  curve,  reach  a terminal 
point,  and  then  return  to  the  beginning  via  a different  curve.  It  may 
then  happen  that  the  work  is  not  equal  to  zero.  Therein  lies  the 
difference between motion along a single straight line when the
potential energy U(x) corresponds to any force F(x), and motion in a plane
or in space where the potential energy may not exist at all.

Now  let  us  examine  the  motion  of  a body  in  a plane  or  in  space. 
If  we  force  the  body  to  move  along  a single  definite  open  curve, 
then  we  again  have  the  same  motion  in  a straight  line.  Indeed,  in 
that  case  we  will  describe  the  position  of  the  body  on  the  curve  by 
the  path  traversed  along  that  curve  (the  path  is  reckoned  from  some 
chosen  point  on  the  curve).  Then  we  actually  have  to  do  with  one 
variable,  the  path  length  of  the  curve.  Therefore,  when  considering 
the  open  second  winding  of  a transformer,  we  can  speak  of  a diffe- 
rence of  potentials  (or,  better,  of  an  electromotive  force)  at  the  ends. 
But  if  we  place  a closed  circle  in  the  first  winding,  then  there  will 
not  be  any  definite  potential  difference  in  the  circle  between  the  two 
points  A and  B (Fig.  161).  The  work  done  in  carrying  a charge  from  A 
to  B depends  on  which  curve  is  traversed,  ADB  or  ACB. 

Now suppose the body is not compelled to move along a fixed
curve but can move to the side as well. Then, for a potential to
exist, it is required that the work of the force in moving the body
around any infinitesimal closed curve be equal to zero. By virtue
of Sec. 11.7, this is equivalent to the condition curl F = 0 (since
such work is equal to the circulation of the vector F around the given
contour), which means that the field F must be irrotational.

But let curl F = 0 not throughout the space but only in a
certain portion (region) (G). Then the work done around a finite closed
contour (L) lying in (G) need not necessarily equal zero! We can only
assert that, given an infinitesimal deformation of the closed contour,
the work does not change since the work done around the contour
$AB_1CDA$ (Fig. 162) differs from the work done around the contour
$AB_2CDA$ by the amount of work done round the contour $AB_1CB_2A$,
which is equal to zero. By carrying out such a deformation, we can
obtain a contour lying in (G) and substantially differing from the
original contour, although the work of the force F done around these
contours is the same.
From this it is clear that if the contour (L) can, within the limits
of  (G),  be  contracted  to  a point  via  a continuous  deformation,  then 
the  work  of  the  force  around  ( L ) is  zero.  This  is  true  because  after 
such  a contraction  the  work  will  clearly  be  zero  (for  there  will  be  no 
motion).  But  since  the  work  did  not  change  during  the  deformation 
of  the  contour,  it  was  zero  for  the  original  contour  as  well. 

The  region  (G)  can  have  the  property  of  being  simply  connected , 
which  means  that  any  closed  curve  in  such  a region  can,  by  means 
of  a continuous  deformation,  be  contracted  into  a point  without 
touching  the  boundary  of  the  region.  For  example,  the  interior  of  a 
circular  cylinder,  the  interior  or  exterior  of  a sphere  are  all  simply 
connected  regions.  By  contrast,  the  exterior  of  an  infinite  circular 
cylinder,  the  interior  or  exterior  of  a torus  (the  surface  of  a doughnut) 
are  nonsimply  connected  regions. 

If  the  region  (G)  is  simply  connected,  then  it  follows  from  the 
foregoing  that  in  this  case  the  work  of  a force  over  a finite  closed 
curve  lying  in  (G)  is  of  necessity  equal  to  zero.  If  we  disregard  the 
specific  physical  interpretation  of  a field,  then  we  can  draw  the  follow- 
ing general  mathematical  conclusion  (cf.  Sec.  11.9):  in  a simply 
connected  region,  an  irrotational  field  is  necessarily  potential,  that 
is,  noncirculatory. 

In  contrast,  if  the  region  is  nonsimply  connected,  then  the  circu- 
lation of  an  irrotational  field  may  be  different  from  zero  if  the  con- 
tour at  hand  cannot  be  contracted  into  a point  by  a continuous  defor- 
mation while  remaining  within  the  region,  which  is  to  say  that  an 
irrotational  field  may  be  circulatory.  Such  for  instance  are : a magnetic 
field  inside  a toroidal  coil  (with  the  planes  of  the  turns  passing  through 
the  axis  of  rotation)  carrying  direct  current;  a velocity  field  with 
irrotational  fluid  flowing  round  a closed  channel,  and  so  forth.  If  we 
construct  the  potential  using  formula  (55),  it  will  be  ambiguous:  if 
we  traverse  a closed  contour,  then  the  circulation  of  the  field  around 
this contour will be subtracted from the potential. When we speak
of a potential field, we have in mind only a single-valued potential;
thus, an irrotational field in a nonsimply connected region need not
necessarily be potential.

Of considerable interest is a toroidal transformer with a circular
iron core. When an alternating current is passed through the winding,
an irrotational electric field is formed in the outer space; this field
has a nonzero circulation around the contour (L) coupled to the torus
(Fig. 163).

Exercise 

Prove  that  the  plane  field  defined  by  formula  (57)  outside 
the  circle  (L)  with  centre  at  the  origin  is  an  irrotational  field. 
What  will  happen  if  the  potential  is  constructed  by  for- 
mula (55)? 

ANSWERS  AND  SOLUTIONS 

Sec.  11.1 

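1. $\overrightarrow{AB} = -2\mathbf{i}+\mathbf{j}+4\mathbf{k}$, $\overrightarrow{AC} = 3\mathbf{j}+5\mathbf{k}$,
$\overrightarrow{AB}\times\overrightarrow{AC} = \begin{vmatrix}\mathbf{i}&\mathbf{j}&\mathbf{k}\\-2&1&4\\0&3&5\end{vmatrix} = -7\mathbf{i}+10\mathbf{j}-6\mathbf{k}$,
$S_{\triangle ABC} = \tfrac12\,|\overrightarrow{AB}\times\overrightarrow{AC}| = \tfrac12\sqrt{7^2+10^2+6^2} = \tfrac12\sqrt{185} = 6.8$.

2. Since $(\mathbf{a}\times\mathbf{b})\cdot\mathbf{c} = \begin{vmatrix}2&-1&0\\3&0&2\\-1&-1&1\end{vmatrix} = 9 > 0$, the triad is
right-handed.

3. $(\mathbf{i}\times\mathbf{i})\times\mathbf{j} = 0$, $\mathbf{i}\times(\mathbf{i}\times\mathbf{j}) = -\mathbf{j}$, that is, $(\mathbf{i}\times\mathbf{i})\times\mathbf{j} \ne \mathbf{i}\times(\mathbf{i}\times\mathbf{j})$.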
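These three answers are easy to confirm with a few lines of code. The sketch below is only an illustration; the helper names are chosen here, and the vectors a, b, c are read off from the rows of the determinant in answer 2.

```python
import math

def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(p * q for p, q in zip(a, b))

# Answer 1: area of the triangle built on AB and AC.
AB = (-2, 1, 4)
AC = (0, 3, 5)
normal = cross(AB, AC)
area = 0.5 * math.sqrt(dot(normal, normal))

# Answer 2: sign of the triple product decides the handedness.
a, b, c = (2, -1, 0), (3, 0, 2), (-1, -1, 1)
triple = dot(cross(a, b), c)

# Answer 3: the vector product is not associative.
i, j = (1, 0, 0), (0, 1, 0)
left = cross(cross(i, i), j)
right = cross(i, cross(i, j))

print(normal, round(area, 1), triple, left, right)
```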

Sec.  11.2 

If (1) b = 0, or (2) b is applied to point O, or (3) b is directed
exactly to O or from O so that the straight line on which the
vector b lies passes through O. The third case embraces the first
two cases.

Sec.  11.3 

1. $T = \oint\dfrac{dr}{dr/dt}$; the substitution of $\dfrac{dr}{dt}$ from equation (15)
leads to the integral
$T = 2\displaystyle\int_{r_{\min}}^{r_{\max}}\left[\frac{2}{m}\left(E+\frac{k}{r}\right)-\frac{G^2}{m^2r^2}\right]^{-1/2}dr$.
Calculating for $-\dfrac{mk^2}{2G^2} < E < 0$ yields $T = \pi k\sqrt{\dfrac{m}{2|E|^3}}$. The
semimajor axis a of the ellipse is equal to $\dfrac12\left(\dfrac{p}{1-\varepsilon}+\dfrac{p}{1+\varepsilon}\right)$ or, after
transformations, $\dfrac{k}{2|E|}$. For $k = \gamma mM$ we get $T = \dfrac{2\pi}{\sqrt{\gamma M}}\,a^{3/2}$.
This is the statement of Kepler's third law in the general case.
2.  Using  formula  (17)  gives  the  value 


rraax 


rmin 


for the period of variation of the radius. The difference is due
to the fact that in the type of motion under consideration here
the period of variation of the radius r(t) is half the period of
revolution of the particle.

Sec.  11.4 

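(a) Passing to the polar coordinates (α, φ) (see end of Sec. 4.7),
we get

$$(J_z)_{\text{cyl}} = \int_0^h dz\int_0^{2\pi}d\varphi\int_0^R\rho\,\alpha^2\,\alpha\,d\alpha = 2\pi\rho\,\frac{R^4}{4}\,h = \frac{\pi\rho R^4h}{2}$$

(b) Denote the linear density of the rod by λ; then

$$(J_z)_{\text{rod}} = 2\int_0^{L/2}x^2\lambda\,dx = 2\lambda\,\frac{(L/2)^3}{3} = \frac{\lambda L^3}{12}$$

Incidentally, these two results make it possible to analyze more
precisely the rotation of a dancer.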

Sec.  11.5 

1. In the two-dimensional case, if the basis $\mathbf{e}'_i$ is obtained from the
basis $\mathbf{e}_i$ by a rotation through the angle φ, then $\alpha_{11} = \alpha_{22} = \cos\varphi$,
$\alpha_{12} = -\alpha_{21} = \sin\varphi$. If the tensor $p_{ij}$ is symmetric, then

$$p'_{12} = \alpha_{1i}\alpha_{2j}p_{ij} = (p_{22}-p_{11})\sin\varphi\cos\varphi + p_{12}(\cos^2\varphi-\sin^2\varphi) = \tfrac12\,(p_{22}-p_{11})\sin 2\varphi + p_{12}\cos 2\varphi$$

Choosing $\tan 2\varphi = 2p_{12}/(p_{11}-p_{22})$, we get $p'_{12} = 0$, which is
what is required.

2. Here, $\eta_{12} = k$ (small) and the others are $\eta_{ij} = 0$; therefore
$\beta_{12} = \beta_{21} = \gamma_{12} = -\gamma_{21} = k/2$, the others are $\beta_{ij} = \gamma_{ij} = 0$.
Applying the solution of Problem 1, we find that to the tensor $\beta_{ij}$
is associated a tension of $1 + \dfrac{k}{2}$ times along the straight line
$x_1 = x_2$ and a compression of $1 - \dfrac{k}{2}$ times along the straight
line $x_1 = -x_2$; by virtue of the solution of the third exercise
of Sec. 9.5, to the tensor $\gamma_{ij}$ there corresponds a rotation through
the angle $\alpha = -\dfrac{k}{2}$. The entire picture can be regarded as
a plane picture (in the plane $x_1$, $x_2$) or as a plane-parallel
picture.
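The choice of angle in answer 1 can be checked numerically. The sketch below is only an illustration with function names chosen here; it uses `atan2` in place of the tangent formula so that the case $p_{11} = p_{22}$ needs no special treatment, which is a convenience assumed for the sketch rather than something stated in the text.

```python
import math

def rotated_offdiag(p11, p12, p22, phi):
    """p'_12 of a symmetric 2x2 tensor after rotating the basis through phi."""
    c, s = math.cos(phi), math.sin(phi)
    return (p22 - p11) * s * c + p12 * (c * c - s * s)

def diagonalizing_angle(p11, p12, p22):
    """Angle phi with tan(2*phi) = 2*p12 / (p11 - p22), written via atan2."""
    return 0.5 * math.atan2(2.0 * p12, p11 - p22)

samples = [(2.0, 1.0, -1.0), (1.0, 3.0, 1.0), (0.0, -2.0, 5.0)]
residuals = [rotated_offdiag(p11, p12, p22, diagonalizing_angle(p11, p12, p22))
             for p11, p12, p22 in samples]
print(residuals)
```

Each residual should vanish to within rounding error, which is the statement $p'_{12} = 0$.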

Sec.  11.6 

1.  Yes. 

2. The basic formula is: $[\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3] = \mathbf{e}_4$. From it there follow three
other formulas via circular permutations. For an interchange
of two factors, the result is multiplied by −1. The formula for
the product of any three vectors is similar to (3), but the
determinant in it is of the fourth order. (It can be verified that the
vector product is perpendicular to all the vectors being
multiplied, is equal in absolute value to the volume of a
parallelepiped constructed on them, and forms with them a quadruple
of the "same sense" as the quadruple $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3, \mathbf{e}_4$.)

Sec.  11.7 

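1. $\operatorname{curl}\mathbf{A}(M) = -\dfrac{\oint\mathbf{A}\times d\mathbf{S}}{d\Omega} = -\lim\limits_{(\Delta\Omega)\to M}\dfrac{\oint_{(\Delta S)}\mathbf{A}\times d\mathbf{S}}{\Delta\Omega}$

2. $\oint_{(L)}f(z)\,dz = \oint_{(L)}(u+iv)(dx+i\,dy) = \oint_{(L)}(u\,dx-v\,dy) + i\oint_{(L)}(v\,dx+u\,dy)$.
If we consider the field $\mathbf{A} = u\mathbf{i}-v\mathbf{j}$, then the integral
$\oint_{(L)}(u\,dx-v\,dy)$ is equal to the circulation of this field
around (L) and, for this reason, by the Stokes theorem to
$\oint_{(L)}(u\,dx-v\,dy) = \int_{(S)}\operatorname{curl}\mathbf{A}\cdot d\mathbf{S}$. But $\mathbf{A} = A_x(x, y)\,\mathbf{i}+A_y(x, y)\,\mathbf{j}$,
that is, by the coordinate formula for the curl,

$$\operatorname{curl}\mathbf{A} = \left(\frac{\partial A_y}{\partial x}-\frac{\partial A_x}{\partial y}\right)\mathbf{k} = \left(-\frac{\partial v}{\partial x}-\frac{\partial u}{\partial y}\right)\mathbf{k} = 0$$

by virtue of the second formula of (17), Ch. 5. Which means
this last integral is equal to zero. The integral $\oint_{(L)}(v\,dx+u\,dy)$
is investigated in similar fashion.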
Sec. 11.8

1. $u\operatorname{curl}\mathbf{A} + \operatorname{grad}u\times\mathbf{A}$, $\mathbf{B}\cdot\operatorname{curl}\mathbf{A} - \mathbf{A}\cdot\operatorname{curl}\mathbf{B}$.

2.  v = (4fi)-1&  • da. 

(do) 


Sec.  11.9 

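$\operatorname{curl}\mathbf{A} = \begin{vmatrix}\mathbf{i}&\mathbf{j}&\mathbf{k}\\[2pt] \dfrac{\partial}{\partial x}&\dfrac{\partial}{\partial y}&\dfrac{\partial}{\partial z}\\[6pt] 2xz&y^2&x^2\end{vmatrix} = 0$. Applying formula (55), we choose
$M_0 = (0, 0, 0)$ and the path $MM_0$ consisting of line
segments connecting the points (x, y, z) and (0, y, z), (0, y, z)
and (0, 0, z), and (0, 0, z) and (0, 0, 0). Whence

$$\varphi(x, y, z) = \int_x^0 2xz\,dx + \int_y^0 y^2\,dy + \int_z^0 x^2\big|_{x=0}\,dz = -x^2z - \frac{y^3}{3}$$
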
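The potential just found can be checked by differentiating it numerically. The sketch below is only an illustration: the function names and the sample point are chosen here, and it assumes the sign convention A = −grad φ, which is the one that makes the result above consistent with formula (55).

```python
def field(x, y, z):
    """The given field A = 2xz i + y^2 j + x^2 k."""
    return (2 * x * z, y ** 2, x ** 2)

def phi(x, y, z):
    """The potential obtained above."""
    return -x ** 2 * z - y ** 3 / 3

def neg_grad(f, x, y, z, eps=1e-6):
    """Central-difference approximation of -grad f at (x, y, z)."""
    return (
        -(f(x + eps, y, z) - f(x - eps, y, z)) / (2 * eps),
        -(f(x, y + eps, z) - f(x, y - eps, z)) / (2 * eps),
        -(f(x, y, z + eps) - f(x, y, z - eps)) / (2 * eps),
    )

point = (1.2, -0.7, 2.5)  # arbitrary sample point
exact = field(*point)
approx = neg_grad(phi, *point)
print(exact, approx)
```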


Sec.  11.10 

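Since during time dt the vector $d\mathbf{r}$ will pass into the vector
$d\mathbf{r} + d\mathbf{v}\,dt$, the condition of invariance of the direction $d\mathbf{r}$ is
of the form $d\mathbf{v}\parallel d\mathbf{r}$. Since $\mathbf{v} = (p\mathbf{k})\times\dfrac{\mathbf{r}}{r^2}$, it follows that
$d\mathbf{v} = (p\mathbf{k})\times\left(\dfrac{d\mathbf{r}}{r^2} - \dfrac{2(\mathbf{r}\cdot d\mathbf{r})\,\mathbf{r}}{r^4}\right)$ and from $d\mathbf{v}\parallel d\mathbf{r}$ we get
$\dfrac{d\mathbf{r}}{r^2} - \dfrac{2(\mathbf{r}\cdot d\mathbf{r})\,\mathbf{r}}{r^4}\perp d\mathbf{r}$, that is, $(r^2\,d\mathbf{r} - 2(\mathbf{r}\cdot d\mathbf{r})\,\mathbf{r})\cdot d\mathbf{r} = 0$.
And so $r^2(d\mathbf{r})^2 - 2(\mathbf{r}\cdot d\mathbf{r})^2 = 0$,
$\cos(\mathbf{r}, d\mathbf{r}) = \dfrac{\mathbf{r}\cdot d\mathbf{r}}{r\,|d\mathbf{r}|} = \pm\dfrac{\sqrt2}{2}$, that is, $d\mathbf{r}$ must form an angle of
±45° with r.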


Sec.  11.11 

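(a) Let the current J flow along the z-axis and compute the
vector H at the point in the xy-plane with radius vector r.
Then

$$d\mathbf{H} = \frac{1}{c\,(r^2+\zeta^2)^{3/2}}\,\bigl(J\,d\zeta\,\mathbf{k}\times(\mathbf{r}-\zeta\mathbf{k})\bigr) = \frac{Jr\,d\zeta}{c\,(r^2+\zeta^2)^{3/2}}\,(\mathbf{k}\times\mathbf{r}^0),$$

$$\mathbf{H} = \int_{-\infty}^{\infty}\frac{Jr\,d\zeta}{c\,(r^2+\zeta^2)^{3/2}}\,(\mathbf{k}\times\mathbf{r}^0) = \int_{-\pi/2}^{\pi/2}\frac{J\cos s\,ds}{rc}\,(\mathbf{k}\times\mathbf{r}^0) = \frac{2J}{cr}\,(\mathbf{k}\times\mathbf{r}^0)$$

(here we put $\zeta = r\tan s$). We thus arrive at a law that has already
been considered.

(b) Let the current J flow around a circle (in the xy-plane) of
radius R with centre at the origin and compute the intensity
at the point (0, 0, z). Then, representing the circle in parametric
form, $x = R\cos\varphi$, $y = R\sin\varphi$, we get

$$\mathbf{H} = \oint\frac{1}{c\,(z^2+R^2)^{3/2}}\,\bigl(J\,d\mathbf{r}\times(z\mathbf{k}-\mathbf{r})\bigr)$$

$$= \int_0^{2\pi}\frac{J}{c\,(z^2+R^2)^{3/2}}\,\bigl((-R\sin\varphi\,\mathbf{i}+R\cos\varphi\,\mathbf{j})\,d\varphi\times(z\mathbf{k}-R\cos\varphi\,\mathbf{i}-R\sin\varphi\,\mathbf{j})\bigr)$$

$$= \frac{JR}{c\,(z^2+R^2)^{3/2}}\int_0^{2\pi}(z\cos\varphi\,\mathbf{i}+z\sin\varphi\,\mathbf{j}+R\mathbf{k})\,d\varphi = \frac{2\pi JR^2}{c\,(z^2+R^2)^{3/2}}\,\mathbf{k}$$

Incidentally, it is easy from this to derive a formula for a
magnetic field of an infinite solenoid.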
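That last remark can be sketched as follows, for the field on the axis only. It is assumed here, as in Sec. 11.11, that the coil has n turns per unit length and is treated as a continuous stack of circular turns, so that a slice of length dζ carries the current Jn dζ; the variable ζ for the position of a turn along the axis is introduced for this sketch.

```latex
% Assumption: a slice d\zeta of the coil acts as one turn with current Jn\,d\zeta,
% and the field of a turn depends only on its distance \zeta from the field point.
\[
H = \int_{-\infty}^{\infty}\frac{2\pi\,(Jn\,d\zeta)\,R^2}{c\,(\zeta^2+R^2)^{3/2}}
  = \frac{2\pi JnR^2}{c}\int_{-\infty}^{\infty}\frac{d\zeta}{(\zeta^2+R^2)^{3/2}}
  = \frac{2\pi JnR^2}{c}\cdot\frac{2}{R^2}
  = \frac{4\pi}{c}\,Jn .
\]
```

The integral is evaluated with the same substitution ζ = R tan s as in part (a), and the result agrees with the intensity found in Sec. 11.11 from the circulation law.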


Sec.  11.12 

Take  the  divergence  of  both  sides  of  (65). 
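Written out, the step can be sketched as follows; it uses only formula (51), div curl = 0, and equation (68).

```latex
\[
0 = \operatorname{div}\operatorname{curl}\mathbf{H}
  = \frac{4\pi}{c}\operatorname{div}\mathbf{j}
    + \frac{1}{c}\frac{\partial}{\partial t}\operatorname{div}\mathbf{E}
  = \frac{4\pi}{c}\operatorname{div}\mathbf{j}
    + \frac{1}{c}\frac{\partial}{\partial t}\,(4\pi\rho)
  = \frac{4\pi}{c}\left(\operatorname{div}\mathbf{j}
    + \frac{\partial\rho}{\partial t}\right),
\]
```

which is equation (69) after division by 4π/c.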

Sec.  11.13 

The equality curl v = 0 is verified via formula (43). Since the
left-hand side of (57) is equal to $\operatorname{grad}\,(p\arctan(y/x))$, then by
formula (55) we get $\varphi(x, y) = -p\arctan(y/x) + \text{constant}$.
This "potential" is multi-valued; in a traversal of the circle
(L) it receives an increment $-2\pi p$, which means the original
irrotational field is not a potential field.
Chapter 12

CALCULUS OF VARIATIONS


Basic courses in mathematical analysis (see, for example, HM, Sec. 2.6)
take up the question of seeking the extrema of a function of one
variable, and in Sec. 4.6 we considered the problem of finding the
extrema of a function of several variables, in other words, we
considered problems with a finite number of degrees of freedom.
The basic aim of the calculus of variations is to obtain general
methods for finding extrema in problems involving an infinite number
of degrees of freedom. In this chapter we will need certain facts from
the theory of functions of several variables (mainly from Secs. 4.1
and 4.6).

12.1  An  instance  of  passing  from  a finite  number  of  degrees  of 
freedom  to  an  infinite  number 

Let us consider a chain of material particles stretched between
two fixed points, the particles being successively connected by the
same kind of springs. Suppose the particles are acted upon by small
transverse forces that deviate the chain from the unloaded state of
equilibrium (Fig. 164); for the sake of simplicity we assume that the
displacements of the particles are perpendicular to the line of the
unloaded state, i.e., the x-axis. Let us find the loaded state of
equilibrium that is characterized by deviations $y_1, y_2, \ldots$ of the particles
from the x-axis.

To  do  this,  we  determine  the  potential  energy  U of  the  chain  in 
any  (not  necessarily  equilibrium)  deviated  state,  reckoning  U from 
the  unloaded  (but  already  taut)  state  of  the  chain.  This  energy  is  made 
up  of  two  parts: 

$$U = U_{\mathrm{el}} + U_{\mathrm{ext}} \tag{1}$$

The first represents the work spent on overcoming the elasticity of
the springs, the second, the work to overcome external forces. The
first is proportional to the elongation $\Delta l$ of the chain:

$$U_{\mathrm{el}} = P\,\Delta l \tag{2}$$

where $P$ is the stretching force of the chain; we will assume
that this force remains constant under deviations. The elongation
of the spring connecting particles $M_i$ and $M_{i+1}$ is equal to
(see Fig. 164)

$$\Delta l_i = \sqrt{h^2 + (\Delta y_i)^2} - h = h\left(\sqrt{1 + \left(\frac{\Delta y_i}{h}\right)^2} - 1\right) \approx h \cdot \frac{1}{2}\left(\frac{\Delta y_i}{h}\right)^2 = \frac{(y_{i+1} - y_i)^2}{2h}$$

Here we took advantage of the approximate formula
$\sqrt{1 + \alpha} \approx 1 + \frac{1}{2}\alpha$, which holds true for small
$|\alpha|$, i.e. in our case for small $|\Delta y_i|/h$.

Fig. 164

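By (2),

$$U_{\mathrm{el}} = P \sum_i \Delta l_i = \frac{P}{2h} \sum_i (y_{i+1} - y_i)^2$$

It is still easier to find

$$U_{\mathrm{ext}} = \sum_i y_i\,(-F_i) = -\sum_i F_i y_i$$

Thus, by (1),

$$U = \frac{P}{2h} \sum_i (y_{i+1} - y_i)^2 - \sum_i F_i y_i \tag{3}$$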

In the equilibrium position, the potential energy must have a
"stationary" value, that is, infinitesimal changes in the coordinates
must lead to higher order changes in the potential energy; in other
words, derivatives of the potential energy with respect to the
coordinates in the equilibrium position must equal zero. And it is easy
to verify that the derivative $\frac{\partial U}{\partial y_i}$ is equal
(with sign reversed) to the total force acting on the $i$th particle.
Thus, if the derivative were different from zero, this would indicate an
uncompensated force, which is impossible under equilibrium. Now the
potential energy must have a minimum in the stable position of
equilibrium. This is necessary so that a displacement from the
equilibrium position can generate a compensating force that will return
the system to the equilibrium position; in other words, so that such a
displacement should definitely require expending a positive amount of
work.

Since on the right-hand side of (3) the coordinate $y_i$ appears in
three terms:

$$\frac{P}{2h}\left[(y_i - y_{i-1})^2 + (y_{i+1} - y_i)^2\right] - F_i y_i$$

it follows that

$$\frac{\partial U}{\partial y_i} = -\frac{P}{h}\,(y_{i+1} - 2y_i + y_{i-1}) - F_i$$

and the steady-state condition yields

$$\frac{P}{h}\,(y_{i+1} - 2y_i + y_{i-1}) + F_i = 0 \quad (i = 1, 2, \ldots, n - 1) \tag{4}$$

What we have is a system of $n - 1$ algebraic equations of the first
degree in $n - 1$ unknown coordinates $y_1, y_2, \ldots, y_{n-1}$. Note that for
$i = 1$ we have to put $y_0$ equal to zero in (4) (this is a fixed point),
and for $i = n - 1$, $y_n$ enters into (4) and is also put equal to zero.
Solving  system  (4),  we  get  the  desired  position  of  equilibrium. 
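
As a rough numerical illustration of solving system (4), the sketch below
eliminates the unknowns of the tridiagonal system one after another and
then evaluates the energy (3). The function names `solve_chain` and
`chain_energy` and the sample forces are assumptions made for this
example only; they do not come from the text.

```python
def solve_chain(forces, P, h):
    """Solve system (4) with y_0 = y_n = 0.

    forces holds F_1 .. F_{n-1}; the returned list holds y_0 .. y_n.
    """
    m = len(forces)  # number of free particles, n - 1
    # Rewrite (4) as  -y[i-1] + 2*y[i] - y[i+1] = h*F[i]/P.
    rhs = [h * F / P for F in forces]
    diag = [2.0] * m
    # Forward elimination of the sub-diagonal entries (all equal to -1).
    for i in range(1, m):
        w = 1.0 / diag[i - 1]
        diag[i] -= w
        rhs[i] += w * rhs[i - 1]
    # Back substitution.
    y = [0.0] * m
    y[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        y[i] = (rhs[i] + y[i + 1]) / diag[i]
    return [0.0] + y + [0.0]


def chain_energy(y, forces, P, h):
    """Potential energy (3) for deviations y_0 .. y_n."""
    elastic = P / (2 * h) * sum((y[i + 1] - y[i]) ** 2 for i in range(len(y) - 1))
    external = sum(F * yi for F, yi in zip(forces, y[1:-1]))
    return elastic - external


# Hypothetical data: five free particles, unequal transverse forces.
sample_forces = [1.0, -0.5, 2.0, 0.0, 1.5]
sample_y = solve_chain(sample_forces, P=4.0, h=0.2)
```

Moving any single particle of `sample_y` slightly and recomputing
`chain_energy` should give a larger value, in line with the remark that
the stable equilibrium corresponds to a minimum of the potential energy.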

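Consider the following example. Let all forces $F_i = F$ be the
same. Rewrite system (4) as

$$\begin{aligned}
y_2 - y_1 &= -\frac{hF}{P} + y_1, \\
y_3 - 2y_2 + y_1 &= -\frac{hF}{P}, \\
y_4 - 2y_3 + y_2 &= -\frac{hF}{P}, \\
&\;\;\vdots
\end{aligned}$$

Check to see that by adding the first and second equations, the first,
second, and third, and so forth, we get

$$\begin{aligned}
y_2 - y_1 &= -\frac{hF}{P} + y_1, \\
y_3 - y_2 &= -2\,\frac{hF}{P} + y_1, \\
y_4 - y_3 &= -3\,\frac{hF}{P} + y_1, \\
&\;\;\vdots
\end{aligned}$$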
In the first equation, by transposing $y_1$ to the right side and then
carrying out the same procedure of addition, we get, using the easily
verifiable equation $1 + 2 + \ldots + k = \frac{1}{2}k(k + 1)$,

$$\begin{aligned}
y_2 &= -\frac{1 \cdot 2}{2}\,\frac{hF}{P} + 2y_1, \\
y_3 &= -\frac{2 \cdot 3}{2}\,\frac{hF}{P} + 3y_1, \\
y_4 &= -\frac{3 \cdot 4}{2}\,\frac{hF}{P} + 4y_1, \\
&\;\;\vdots
\end{aligned}$$

In the general form,

$$y_i = -\frac{(i - 1)\,i}{2}\,\frac{hF}{P} + i\,y_1 \tag{5}$$


Since $y_n$ must turn out equal to zero, it follows that

$$-\frac{(n - 1)\,n}{2}\,\frac{hF}{P} + n\,y_1 = 0, \quad \text{whence} \quad y_1 = \frac{n - 1}{2}\,\frac{hF}{P}$$

Substituting into (5), we finally get

$$y_i = -\frac{(i - 1)\,i}{2}\,\frac{hF}{P} + i\,\frac{n - 1}{2}\,\frac{hF}{P} = \frac{i(n - i)}{2}\,\frac{hF}{P} \quad (i = 1, 2, \ldots, n - 1) \tag{6}$$

the formula holds true for $i = 0$ and for $i = n$ as well.
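
Formula (6) can be checked by substituting it back into system (4). The
following sketch does this with exact rational arithmetic, so no rounding
is involved; the helper names and the sample values of $n$, $h$, $F$, $P$
are assumptions chosen only for illustration.

```python
from fractions import Fraction


def chain_deviation(i, n, h, F, P):
    """Closed-form deviation (6): y_i = i (n - i) h F / (2 P)."""
    return Fraction(i * (n - i), 2) * h * F / P


def satisfies_system(n, h, F, P):
    """Check every equation of (4) with all F_i = F (h, F, P given as Fractions)."""
    y = [chain_deviation(i, n, h, F, P) for i in range(n + 1)]
    return all(
        P / h * (y[i + 1] - 2 * y[i] + y[i - 1]) + F == 0
        for i in range(1, n)
    )


# Hypothetical sample: ten springs of length 1/10, force 3, tension 7.
check = satisfies_system(10, Fraction(1, 10), Fraction(3), Fraction(7))
```

The end values `chain_deviation(0, ...)` and `chain_deviation(n, ...)` are
zero, which agrees with the remark that (6) also holds at the fixed points.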

The system at hand is determined by the deviations of $n - 1$
of its particles, which means that it has $n - 1$ degrees of freedom.
By increasing $n$ for a given length $l$ of the chain, we at the same time
increase  infinitely  the  number  of  degrees  of  freedom,  and  in  the  limit 
we  obtain  from  the  chain  a continuous  string,  which  constitutes  a 
system  with  an  infinite  number  of  degrees  of  freedom  (because  when 
considering  a string,  we  can  arbitrarily  specify  the  deviation  of  any 
number  of  its  points).  To  summarize,  then:  from  a discrete  pointwise 
system  of  particles  we  obtain,  in  the  limit,  a continuous  medium  with 
a continuously  distributed  mass. 

Now let us see how the expression for potential energy and the
condition of static equilibrium transform in the limit. We will assume
that the outer transverse force is distributed along the string with a
certain density $f(x)$ so that there is a force $f(x)h$ for the small
length $h$, i.e. $F_i = f_i h$ with $f_i = f(x_i)$. Rewrite (3) for the
potential energy as

$$U = \sum_i \left[\frac{P}{2}\left(\frac{y_{i+1} - y_i}{h}\right)^2 - f_i y_i\right] h$$

For large $n$, that is, for small $h$, we can replace
$\frac{y_{i+1} - y_i}{h}$ by $y'_i$, and the potential energy assumes the form

$$U = \sum_i \left[\frac{P}{2}\,y_i'^2 - f_i y_i\right] \Delta x$$

since $h = \Delta x$. But this is an integral sum, and in the limit, as
$h \to 0$, this sum becomes the integral

$$U = \int_0^l \left[\frac{P}{2}\,y'^2 - f(x)\,y\right] dx \tag{7}$$
This is the expression for the potential energy in the case of a
continuous string. And so the problem of finding the shape of
equilibrium of a taut loaded string can be formulated mathematically thus:
to find the function $y(x)$ (which describes the shape of the string)
that satisfies the conditions of attachment,

$$y(0) = 0, \quad y(l) = 0 \tag{8}$$

and  that  imparts  a minimal  value  to  the  integral  (7).  Since  there  are 
an  infinitude  of  degrees  of  freedom  in  the  choice  of  such  a function, 
this  problem  no  longer  belongs  to  the  theory  of  functions  of  several 
variables,  but  to  the  calculus  of  variations. 

Now let us see what the condition (4) of static equilibrium goes
into in the limit; (4) can now be rewritten as

$$P\,\frac{y_{i+1} - 2y_i + y_{i-1}}{h^2} + f_i = 0 \tag{9}$$

Recall that the ratio (Sec. 2.2)

$$\frac{y_{i+1} - 2y_i + y_{i-1}}{h^2}$$

the so-called second divided difference, is close to the value of the
second derivative $y''$ for small $h$. Thus in the limit, as $h \to 0$, we get
from (9)

$$P y'' + f(x) = 0, \quad \text{that is,} \quad y'' = -\frac{1}{P}\,f(x) \tag{10}$$

and  we  have  to  find  the  solution  y(x)  of  this  equation  that  satisfies 
the  conditions  (8).  We  solved  this  problem  in  Sec.  8.8. 
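
The passage from (9) to (10) rests on the second divided difference
approaching $y''$ as $h \to 0$. A small numerical sketch of that statement
follows; the test function $\sin x$, the point $x = 1$ and the step sizes
are assumptions picked for illustration.

```python
import math


def second_divided_difference(y, x, h):
    """(y(x + h) - 2 y(x) + y(x - h)) / h^2, the ratio appearing in (9)."""
    return (y(x + h) - 2.0 * y(x) + y(x - h)) / h ** 2


def approximation_errors(y, y_second, x, steps):
    """Absolute gap between the divided difference and the exact y'' for each step h."""
    return [abs(second_divided_difference(y, x, h) - y_second(x)) for h in steps]


# For y = sin x the exact second derivative is -sin x.
errors = approximation_errors(math.sin, lambda x: -math.sin(x), 1.0, [0.1, 0.05, 0.025])
```

Halving $h$ should cut the error roughly by a factor of four, since the
leading error term of this difference is proportional to $h^2$.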

And  so  the  variational  problem  of  finding  the  minimum  of  an 
integral  under  the  conditions  (8)  reduced  to  solving  the  differential 
equation (10) under the same conditions. It is to be stressed that in
deriving equation (10) we did not make use of the fact that it was
precisely the minimum of integral (7) that was being sought, because
when deriving the conditions (4) we only made use of the condition
that expression (3) be stationary. This means that (10) is the
condition of static equilibrium in the problem at hand. However, it is
easy  to  verify  that  in  the  given  example  the  equilibrium  is  stable  and 
thus  the  potential  energy  (7)  does  not  merely  assume  a stationary  value 
for  the  solution  but  a minimal  value.  Indeed,  from  physical  reasoning 
it  is  clear  that  there  is  some  position  of  the  string  that  corresponds  to 
a minimum  of  potential  energy.  But  in  Sec.  8.8  we  saw  that  under 
conditions  (8)  the  equation  (10)  has  only  a unique  solution,  which  means 
that  this  solution  is  the  one  that  makes  the  integral  (7)  a minimum. 
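
The minimum property can also be probed numerically. The sketch below
assumes a constant load density $f_0$, for which the parabola
$y = f_0 x(l - x)/(2P)$ satisfies both (10) and (8), as direct substitution
shows. It evaluates the integral (7) by the midpoint rule for this
parabola and for perturbed shapes $y + \varepsilon \sin(m\pi x/l)$ that
keep the ends fixed. All names and sample values are assumptions of this
illustration.

```python
import math


def string_energy(y, dy, f, P, l, n=2000):
    """Midpoint-rule approximation of (7): integral of (P/2) y'^2 - f(x) y over [0, l]."""
    h = l / n
    total = 0.0
    for j in range(n):
        x = (j + 0.5) * h
        total += (0.5 * P * dy(x) ** 2 - f(x) * y(x)) * h
    return total


P, f0, l = 2.0, 3.0, 1.0  # hypothetical tension, load density, length


def equilibrium(x):
    return f0 * x * (l - x) / (2 * P)


def equilibrium_slope(x):
    return f0 * (l - 2 * x) / (2 * P)


def perturbed_energy(eps, m):
    """Energy of y + eps * sin(m pi x / l), which still satisfies (8)."""
    w = m * math.pi / l
    return string_energy(
        lambda x: equilibrium(x) + eps * math.sin(w * x),
        lambda x: equilibrium_slope(x) + eps * w * math.cos(w * x),
        lambda x: f0, P, l,
    )


base_energy = string_energy(equilibrium, equilibrium_slope, lambda x: f0, P, l)
```

For every sign of `eps` and every `m` tried, `perturbed_energy(eps, m)`
should exceed `base_energy`, because the first-order term vanishes and the
remaining term $\frac{P}{2}\varepsilon^2\int \eta'^2\,dx$ is positive.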

Let us consider an example. Let $f(x) \equiv f_0 = \text{constant}$. Then
equation (10) is readily integrated:

$$y' = -\frac{f_0}{P}\,x + C_1, \quad y = -\frac{f_0}{P}\,\frac{x^2}{2} + C_1 x + C_2$$

From the conditions (8) we get

$$C_2 = 0, \quad C_1 = \frac{f_0}{P}\,\frac{l}{2}$$

whence, finally,

$$y = -\frac{f_0}{P}\,\frac{x^2}{2} + \frac{f_0}{P}\,\frac{l}{2}\,x = \frac{f_0\,x(l - x)}{2P}$$

(Obtain this same expression from the solution (6) for a discrete
case by putting $F = f_0 h$, $n = \frac{l}{h}$, $i = \frac{x}{h}$.)

We conclude with a more general case where the particles $M_i$ (and,
in the limit, the points of the string) are acted upon also by an elastic
force tending to return them to the nonloaded state of equilibrium,
with coefficient of elasticity $K$ (cushion). Here, the work

$$\sum_i \int_0^{y_i} (Ky)\,dy = \frac{K}{2} \sum_i y_i^2$$

is done to overcome the elasticity of the cushion, and so the expression
for the potential energy becomes

$$U = \frac{P}{2h} \sum_i (y_{i+1} - y_i)^2 + \frac{K}{2} \sum_i y_i^2 - \sum_i F_i y_i$$

(instead of (3)). The condition of stationarity, i.e. static equilibrium,
yields

$$\frac{P}{h}\,(y_{i+1} - 2y_i + y_{i-1}) - K y_i + F_i = 0 \quad (i = 1, 2, \ldots, n - 1)$$
When passing to a continuous string it is natural to introduce the
concept of "linear density of the coefficient of elasticity" $k$ so that the
force of elasticity per length $h$ of the string is equal to $khy$. Then
instead of (7) and (10) we get the relations

$$U = \int_0^l \left[\frac{P}{2}\,y'^2 + \frac{k}{2}\,y^2 - f(x)\,y\right] dx, \tag{11}$$

$$P y'' - k y + f(x) = 0 \tag{12}$$

Thus,  here  too  the  solution  of  the  variational  problem  reduces  to  the 
solution  of  the  boundary-value  problem  for  the  differential  equation. 
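
A boundary-value problem of this kind can be approximated by going back
to the discrete chain. The sketch below replaces $y''$ in (12) by the
second divided difference, solves the resulting tridiagonal system, and
compares the result with the closed-form solution for a constant load
$f_0$, namely $y = \frac{f_0}{k}\left(1 - \frac{\cosh \lambda(x - l/2)}{\cosh(\lambda l/2)}\right)$
with $\lambda = \sqrt{k/P}$ (this formula is not in the text; it can be
verified by substitution into (12) and (8)). Function names and sample
values are assumptions.

```python
import math


def solve_cushioned_string(f, P, k, l, n):
    """Finite-difference solution of P y'' - k y + f(x) = 0, y(0) = y(l) = 0.

    Returns approximate values y_0 .. y_n at the points x_i = i * l / n.
    """
    h = l / n
    m = n - 1
    # Discrete equations: -y[i-1] + (2 + k h^2 / P) y[i] - y[i+1] = h^2 f(x_i) / P.
    diag = [2.0 + k * h * h / P] * m
    rhs = [h * h * f(i * h) / P for i in range(1, n)]
    for i in range(1, m):          # forward elimination
        w = 1.0 / diag[i - 1]
        diag[i] -= w
        rhs[i] += w * rhs[i - 1]
    y = [0.0] * m
    y[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):  # back substitution
        y[i] = (rhs[i] + y[i + 1]) / diag[i]
    return [0.0] + y + [0.0]


def exact_constant_load(x, f0, P, k, l):
    """Closed-form solution of (12) with (8) for f(x) = f0."""
    lam = math.sqrt(k / P)
    return f0 / k * (1.0 - math.cosh(lam * (x - l / 2)) / math.cosh(lam * l / 2))


# Hypothetical data: P = 2, k = 5, f0 = 3, l = 1, 200 subintervals.
approx = solve_cushioned_string(lambda x: 3.0, 2.0, 5.0, 1.0, 200)
max_gap = max(
    abs(approx[i] - exact_constant_load(i / 200, 3.0, 2.0, 5.0, 1.0))
    for i in range(201)
)
```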

Exercise 

Solve  the  system  (4)  under  the  conditions  (8)  in  the  general  case. 
Obtain the solution for a continuous string as $n \to \infty$.

12.2  Functional 

The next variational problem, which arose at the end of the 17th
century and was solved by Leibnitz, L'Hospital and Newton
independently, demonstrated the force of the burgeoning mathematical
analysis.

Suppose a particle $M$ acted upon by gravity rolls without friction
with a zero initial velocity from point $A$ to point $B$ along a curve $(L)$
(Fig. 165). How can this curve be chosen so that the descent occurs
in minimum time? For an analytic statement of the problem, denote
the unknown equation of the curve by $y = y(x)$; then the function
$y(x)$ must first of all satisfy the conditions:

$$y(a) = y_a, \quad y(b) = y_b \tag{13}$$

where $y_a$ and $y_b$ are the ordinates of the specified points $A$ and $B$.
The velocity of $M$ at any time is readily determined by proceeding
from the law of conservation of energy:

$$\frac{mv^2}{2} = mg(y_a - y), \quad \text{whence} \quad v = \sqrt{2g(y_a - y)}$$

The horizontal component of the velocity is

$$\frac{dx}{dt} = \frac{dx}{ds}\,\frac{ds}{dt} = v\,\frac{dx}{ds} = v\,\frac{dx}{\sqrt{dx^2 + dy^2}} = \sqrt{2g(y_a - y)}\;\frac{1}{\sqrt{1 + y'^2}}$$

Expressing $dt$ in these terms and integrating, we get the total time
of descent:

$$T = \int_a^b \frac{\sqrt{1 + y'^2}}{\sqrt{2g(y_a - y)}}\,dx \tag{14}$$
Thus, it is required to choose, from among all functions that satisfy
the  conditions  (13),  that  function  for  which  the  integral  (14)  assumes 
the  smallest  possible  value. 
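
To get a feeling for how the descent time depends on the curve, the
integral (14) can be evaluated numerically for a few trial shapes. The
sketch below is only an illustration under assumed data: $A = (0, 0)$,
$B = (1, -1)$, $g = 9.81$, a straight line and a parabolic arc. The
substitution $x = a + (b - a)u^2$ is a numerical device (not from the
text) that removes the integrable singularity of the integrand at the
starting point, where the velocity is zero.

```python
import math


def descent_time(drop, slope, a, b, g=9.81, n=2000):
    """Approximate the integral (14).

    drop(x) returns y_a - y(x) and slope(x) returns y'(x).
    With x = a + (b - a) u^2 the midpoint rule in u never touches u = 0.
    """
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * h
        x = a + (b - a) * u * u
        dx_du = 2.0 * (b - a) * u
        integrand = math.sqrt(1.0 + slope(x) ** 2) / math.sqrt(2.0 * g * drop(x))
        total += integrand * dx_du * h
    return total


# Straight line y = -x from A = (0, 0) to B = (1, -1).
time_line = descent_time(lambda x: x, lambda x: -1.0, 0.0, 1.0)

# Parabolic arc y = -(2x - x^2) between the same points, steeper at the start.
time_parabola = descent_time(lambda x: 2.0 * x - x * x, lambda x: 2.0 * x - 2.0, 0.0, 1.0)
```

For the straight line the result can be compared with the elementary
value $2/\sqrt{g}$ for sliding down a 45-degree incline of length
$\sqrt{2}$; the parabolic arc gives a shorter time, which shows that the
straight line is not the curve of quickest descent.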

The foregoing problem, like the problems examined in Sec. 12.1
for a continuous string, is a typical problem of the calculus of variations.
Two  features  characterize  such  problems.  First  of  all,  they  are  problems 
that  involve  an  extremum  (maximum  or  minimum),  which  is  to  say, 
they  are  problems  in  which  it  is  required  to  make  a certain  numerical 
parameter  an  extremum  ( U in  the  problems  in  Sec.  12.1,  T in  the 
problem  of  the  curve  of  quickest  descent).  We  have  already  solved 
extremum  problems;  for  functions  of  one  variable  the  desired  value 
is  a certain  number,  namely  the  value  of  the  independent  variable, 
and  for  functions  of  several  variables  the  desired  element  is  a set  of 
numbers.  In  contrast  to  this,  in  the  foregoing  problems  of  this  chapter 
we did not seek a number or a set of numbers but a function y(x),
such  that  the  indicated  numerical  parameter  is  expressed  in  terms  of 
all  values  of  this  function  (via  formula  (7)  or  (11)  of  Sec.  12.1  and 
(14)  in  the  problem  just  discussed). 

There  are  many  other  problems  involving  an  extremum  in  which 
we  seek  a function  (geometrically  speaking,  a curve  or  a surface)  or 
a set  of  functions.  All  such  problems  constitute  the  subject  matter  of 
the  calculus  of  variations. 

Calculus  of  variations  problems  can  also  be  examined  from  the 
following point of view. When selecting a value of the independent
variable there is one degree of freedom. If we seek a set of n values of
such variables (as in Chapter 4), then there are n degrees of freedom.
Now if we vary the functional relationship in an arbitrary manner,
then there are an infinitude of degrees of freedom; indeed, it is possible,
in an arbitrary fashion, to specify the values of a function for any
number of values of the independent variable. This means that the problems
of  the  calculus  of  variations  are  problems  involving  extrema  for  the 
case  of  an  infinite  number  of  degrees  of  freedom  in  the  choice  of  the 
desired  object. 

The  following  scheme  is  characteristic  of  variational  problems. 
There is a certain scalar parameter I (this is U in Sec. 12.1 and T
in the last problem) that is expressed by a definite formula (7), (11)
or (14) in terms of an unknown function y(x) (so far we confine
ourselves to functions of one variable) that has to be chosen. This function
is more or less arbitrary, although it satisfies certain conditions:
for instance, the conditions (8) in Sec. 12.1 and (13) in this section, and
also the requirement of continuity.

A law  of  this  nature,  according  to  which  with  every  function  of 
a definite  class  of  functions  there  is  associated  the  value  of  a certain 
scalar  parameter,  is  called  a functional.  Thus,  the  basic  calculus  of 
variations  problem  is  a problem  involving  finding  the  extremum  of  a 
given  functional. 

To  clarify  the  concept  of  a functional,  recall  once  again  that  of 
a function. Consider, say, the formula $y = x^2$. By this formula, every
value $x$ is associated with a specific value $y$: for $x = 2$ we have $y = 4$,
for $x = -\frac{1}{2}$ we have $y = \frac{1}{4}$, etc. We have a law
according to which certain numbers $x$ are associated with certain numbers
$y$. The formula $y = x^3$ defines a different law, i.e. a different
function, the formula $y = \sin x$ a third law, etc.

The  simplest  example  of  a functional  is  the  definite  integral. 
Consider,  for  example,  the  formula 

$$I = \int_0^1 y^2\,dx \quad (y = y(x)) \tag{15}$$

Substituting various concrete functions in place of $y(x)$, we get
concrete numerical values for $I$. For instance, choosing $y = x^2$, we get

$$I = \int_0^1 (x^2)^2\,dx = \int_0^1 x^4\,dx = \frac{x^5}{5}\,\Big|_0^1 = \frac{1}{5} = 0.2$$

Taking $y = x^3$, we get $I = \frac{1}{7} = 0.143$; choosing $y = \sin x$, we get
$I = \frac{1}{2} - \frac{1}{4}\sin 2 = 0.273$, and so forth. Thus, formula (15)
specifies a law by which every function $y(x)$ is associated with a value $I$,
which means formula (15) defines a functional. The formula

$$I = \int_0^1 x\,y'\,dx \quad \left(y = y(x),\; y' = \frac{dy}{dx}\right) \tag{16}$$

defines  a different  functional,  and  the  formula 

3 

I = ^y-dx  (y  = y(x)) 

a third  functional  (note  the  limits  of  integration),  and  so  forth. 
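
In programming terms a functional is simply a routine that takes a
function as its argument and returns a number. The sketch below
implements the functional (15) with a Simpson-rule quadrature (an
assumed numerical method, not something the text prescribes) and
evaluates it for the three functions mentioned above.

```python
import math


def integrate(g, a, b, n=1000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    s = g(a) + g(b)
    for j in range(1, n):
        s += (4 if j % 2 else 2) * g(a + j * h)
    return s * h / 3


def functional_15(y):
    """The functional (15): maps a function y to the number integral_0^1 y^2 dx."""
    return integrate(lambda x: y(x) ** 2, 0.0, 1.0)


value_square = functional_15(lambda x: x ** 2)   # y = x^2
value_cube = functional_15(lambda x: x ** 3)     # y = x^3
value_sine = functional_15(math.sin)             # y = sin x
```

The three results should agree with $\frac{1}{5}$, $\frac{1}{7}$ and
$\frac{1}{2} - \frac{1}{4}\sin 2$ to within the quadrature error.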
To summarize: if numbers are associated with numbers, we speak
of a function. If functions are associated with numbers, we speak
of a functional. Recall (Sec. 6.2) that if functions are associated with
functions, we have an operator.

The functional (16) is called a linear functional; this means that
when the functions $y(x)$ are added, the values of the functional $I$ are
also added:

$$\int_0^1 x\,(y_1 + y_2)'\,dx = \int_0^1 x\,y_1'\,dx + \int_0^1 x\,y_2'\,dx$$


The  functional  (15)  is  a nonlinear  (quadratic)  functional. 

In investigating a functional it is sometimes important to
determine how its value varies under a small change in the function on which
it is dependent. Let us consider this problem using (15). Let the
right-hand member be represented by a function $y(x)$, and then by a new
function $y(x) + \delta y(x)$, where $\delta y(x)$, the variation of $y$, is an
arbitrary function that assumes small values. (For example, we could first
have $y = x^2$, and then $y = x^2 + \alpha x^3$, where the constant $\alpha$ is small.)
Then the value of the functional also changes slightly and becomes

$$\int_0^1 [y + \delta y]^2\,dx = \int_0^1 y^2\,dx + 2\int_0^1 y\,\delta y\,dx + \int_0^1 (\delta y)^2\,dx$$

Thus, the increment in this value is

$$\Delta I = 2\int_0^1 y\,\delta y\,dx + \int_0^1 (\delta y)^2\,dx \tag{17}$$

If for the time being we fix the function $y(x)$ and change its
variation, then, depending on $\delta y$, the first term in the right-hand member
of (17) is a linear term, whereas the second one is a quadratic term (in
the general case we also have terms of higher powers as well). Since
the values of $\delta y$ are small, the main role in the right member is played
by the linear term, while the quadratic term is of higher order.
This linear term in the increment of the functional is called the variation
of the functional and is denoted by $\delta I$; in the case of (15) we thus have

$$\delta I = 2\int_0^1 y\,\delta y\,dx \tag{18}$$

Thus, to within higher order terms, we have

$$\Delta I \approx \delta I \tag{19}$$

In  those  cases  when  second  and  higher  order  terms  can  be  neglected, 
we  simply  say  that  the  variation  of  the  functional  is  an  infinitesimal 
increment obtained through an infinitesimal change (variation) in
the function upon which the value of the functional depends. (However,
in the case of substantial $\delta y$, it is of course true that $\Delta I \neq \delta I$!) Recall
the  fundamentals  of  differential  calculus  and  you  will  see  a complete 
analogy  between  the  notions  of  the  differential  of  a function  and  the 
variation  of  a functional. 
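
The relation between the increment and the variation can be made
concrete with the example mentioned earlier, $y = x^2$ and
$\delta y = \alpha x^3$, for the functional (15). In that case
$\delta I = 2\int_0^1 x^2 \cdot \alpha x^3\,dx = \alpha/3$ and
$\Delta I = \alpha/3 + \alpha^2/7$. The sketch below computes both
quantities numerically; the helper names and the Simpson quadrature are
assumptions of this illustration.

```python
def simpson(g, a, b, n=1000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    s = g(a) + g(b)
    for j in range(1, n):
        s += (4 if j % 2 else 2) * g(a + j * h)
    return s * h / 3


def increment(y, dy):
    """Delta I = I[y + dy] - I[y] for the functional I[y] = integral_0^1 y^2 dx."""
    shifted = simpson(lambda x: (y(x) + dy(x)) ** 2, 0.0, 1.0)
    original = simpson(lambda x: y(x) ** 2, 0.0, 1.0)
    return shifted - original


def variation(y, dy):
    """delta I = 2 * integral_0^1 y dy dx, the linear part of the increment."""
    return 2.0 * simpson(lambda x: y(x) * dy(x), 0.0, 1.0)


def compare(alpha):
    """Return (Delta I, delta I) for y = x^2 and the variation alpha * x^3."""
    y = lambda x: x ** 2
    dy = lambda x: alpha * x ** 3
    return increment(y, dy), variation(y, dy)
```

The relative gap $|\Delta I - \delta I| / |\delta I| = 3|\alpha|/7$ shrinks in
proportion to $\alpha$, which is the numerical face of the statement that
$\Delta I \approx \delta I$ only for small variations.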

Exercises 

l 

1.  Find  the  variation  of  the  functionals  I — \ — dx\ 

J 

o 

i 

y2(0)  + ^ (xy  + /2)  dx. 

0 

2. For the functional $I = \int_0^1 y^2\,dx$ put $y = 2x$, $\delta y = \alpha x^2$ and
compare $\delta I$ with $\Delta I$ for $\alpha = 1$, $-0.1$, $0.01$.

12.3  Necessary  condition  of  an  extremum 

Let a function $y(x)$ realize an extremum of the functional $I$.
In other words, the value of the functional for the function $y(x)$ is
greater (in the case of a maximum) or less (in the case of a minimum)
than the values of this functional for all functions sufficiently close
to $y(x)$. (The last stipulation has to do with the fact that, generally, a
functional can have several extrema, just as a function can have
several extreme points.) Then $\Delta I$ will be negative for a maximum
and positive for a minimum for all the indicated functions, that is,
in both cases there is no change of sign. But then it follows from this
that

$$\delta I = 0 \tag{20}$$

Indeed, from (19) it follows that if $\delta I \neq 0$, then $\Delta I$ and $\delta I$
have the same sign, but due to the linear dependence of $\delta I$ on $\delta y$,
when $\delta y$ changes sign, $\delta I$ will also change sign (see, for example,
formula (18)), which runs counter to the foregoing.

Thus, condition (20) is a necessary condition for an extremum.
Incidentally, it is quite analogous to the necessary condition for the
extremum of a function that we find in the differential calculus. If
the function $y(x)$ attains an extremum in comparison with all
close-lying functions, then it follows that $\delta y$ can be arbitrary in the left
member of (20) (it is not even required that this function be small
because, due to the linearity of $\delta I$, if $\delta I = 0$ for small $\delta y$, then this is
also valid for any $\delta y$). If an extremum is attained in comparison with
the functions of a certain class (see, for instance, the conditions (8)
and (13)), then $\delta y$ must be such that $y + \delta y$ belongs to this class.
The condition (20) is the condition of stationarity of the functional $I$
for a change in the type of function $y(x)$, upon which this functional
depends. In many problems we may be interested in all stationary
values and not only in the extremal values of the functional. For
instance, in the problems of Sec. 12.1 we saw that if for the functional
we consider the potential energy of a continuous medium, then to every
stationary value there corresponds a static position of equilibrium,
whereas to the minimal values of the functional there correspond stable
states of equilibrium.

Let us illustrate the application of condition (20) in an elementary
problem involving finding the extremum of a functional of the form

$$I = \int_a^b F(x, y)\,dx \tag{21}$$

where for $y$ we substitute any function of $x$. To find $\delta I$ we substitute
$y + \delta y$ instead of $y$ and expand the result in a Taylor series:

$$I + \Delta I = \int_a^b F(x, y + \delta y)\,dx = \int_a^b F(x, y)\,dx + \int_a^b F'_y(x, y)\,\delta y\,dx + \int_a^b \frac{1}{2}\,F''_{yy}(x, y)\,(\delta y)^2\,dx + \ldots$$

The variation of the functional, that is, the linear portion of the
increment, is equal to

$$\delta I = \int_a^b F'_y(x, y)\,\delta y\,dx$$

From this, by virtue of the arbitrary nature of the choice of $\delta y(x)$
it follows that

$$F'_y(x, y) = 0 \tag{22}$$

Indeed, if we put $\delta y = \alpha F'_y(x, y(x))$ ($\alpha$ small), then the last integral
is equal to

$$\alpha \int_a^b (F'_y)^2\,dx$$

But it must be equal to zero, whence follows our assertion. (If a
continuous function does not assume any negative values, and the integral
of it is zero, then the function is identically zero.)

We can arrive at condition (22) in a different way. If the integral
(21) must have an extremal (say, minimal) value, then the integrand
must also be minimal. In other words, if we add independent terms,
then every term must be minimal if we want the sum to be minimal.
Thus, for every $x$ the value of $y$ has to be chosen from the condition
of the function $F$ being a minimum, which immediately leads to (22).
Hence, there is nothing fundamentally new here as compared with
problems arising from the essentials of differential calculus.

For  one  thing,  we  arrive  at  functionals  of  the  type  (21)  if  in  the 
reasoning  of  Sec.  12.1  we  assume  that  the  points  of  a string  are  not 
elastically  related  in  any  way  and  are  subjected  solely  to  the  action 
of  an  external  force  and  the  elasticity  of  the  cushion.  It  is  clear  that 
in  that  case  every  point  of  the  string  will  take  up  a location  irrespective 
of  the  positions  of  the  remaining  points,  so  that  there  is  nothing  funda- 
mentally new  here  in  comparison  with  the  statics  of  a point.  In  this 
instance,  the  functional  has  the  form 

$$U = \int_0^l \left[\frac{k}{2}\,y^2 - f(x)\,y\right] dx$$

(which is (11) for $P = 0$) and the condition of equilibrium (22) yields

$$k y - f(x) = 0, \quad \text{or} \quad y = \frac{1}{k}\,f(x)$$

The  same  result  is  obtained  from  (12)  for  P = 0.  It  is  interesting  that 
this  solution  is,  generally,  discontinuous  both  due  to  the  possibility  of 
discontinuities  in  the  function  f(x)  (which,  for  example,  is  the  case  if 
an  external  load  is  applied  only  to  a portion  of  the  string)  and  due  to 
the  boundary  conditions  (8).  Of  course,  that  is  as  it  should  be  if  the 
points  of  the  string  are  in  no  way  related  to  one  another.  Actually, 
it  is  hard  to  speak  of  a "string”  in  such  a situation. 
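
Because the integrand contains no derivative of $y$, the minimization can
be carried out point by point. The following sketch does this for a load
that acts only on the middle half of a unit-length "string"; the load
profile, the value of $k$ and the function names are assumptions made for
the sake of the example.

```python
def integrand(x, y, f, k):
    """F(x, y) = (k/2) y^2 - f(x) y, the integrand of (11) with P = 0."""
    return 0.5 * k * y * y - f(x) * y


def pointwise_solution(f, k, xs):
    """Condition (22), k y - f(x) = 0, solved separately at each point x."""
    return [f(x) / k for x in xs]


def partial_load(x):
    """Hypothetical load: density 3 on 0.25 <= x <= 0.75, zero elsewhere."""
    return 3.0 if 0.25 <= x <= 0.75 else 0.0


xs = [j / 20 for j in range(21)]
ys = pointwise_solution(partial_load, 2.0, xs)
```

The list `ys` jumps from zero to a positive constant where the load
begins and back to zero where it ends, which illustrates the
discontinuity discussed above.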

Exercise 

Find the functions $y(x)$ that make the functional minimal:

(a) $\int_0^1 (1 - x)(y - 2x)^2\,dx$, (b) $\int_0^2 (1 - x)(y - 2x)^2\,dx$.

12.4  Euler’s  equation 

Take another look at the problem in Sec. 12.1 on the equilibrium
of an elastic string under the action of an external force. We found
that problem reduces to finding an extremum (or even simply a
stationary value) of a functional in the class of functions satisfying the
boundary conditions (8). Let us try to find a solution with the aid of
condition (20); to do this, we first try to compute $\delta U$. Let $y$ have an
increment $\delta y$; then $y'$ receives the increment $\delta(y')$. But it is easy to verify
that $\delta(y') = (\delta y)'$: indeed, if $\delta y = Y(x) - y(x)$, then
$\delta(y') = Y'(x) - y'(x) = [Y(x) - y(x)]' = (\delta y)'$, whence

$$\Delta U = \int_0^l \left[\frac{P}{2}\,(y' + \delta y')^2 - f(x)\,(y + \delta y)\right] dx - \int_0^l \left[\frac{P}{2}\,(y')^2 - f(x)\,y\right] dx = \int_0^l \left[P y'\,\delta y' + \frac{P}{2}\,(\delta y')^2 - f(x)\,\delta y\right] dx$$

Dropping the second-order term, we get

$$\delta U = \int_0^l \left[P y'\,\delta y' - f(x)\,\delta y\right] dx$$

The necessary condition (20) yields

$$P \int_0^l y'\,\delta y'\,dx - \int_0^l f(x)\,\delta y\,dx = 0 \tag{23}$$

This equation should hold for any variation $\delta y$ that satisfies the
conditions

$$\delta y(0) = 0, \quad \delta y(l) = 0 \tag{24}$$

which are necessary for $y + \delta y$ to satisfy the same conditions (8) as $y$.

The new element in the condition (23) is that we have $\delta y'$ along
with $\delta y$, and these two quantities cannot be considered to be
independent. So let us integrate the first term by parts, applying (24):

$$0 = P y'\,\delta y\,\Big|_0^l - P \int_0^l y''\,\delta y\,dx - \int_0^l f(x)\,\delta y\,dx = -\int_0^l \left[P y'' + f(x)\right] \delta y\,dx$$

Taking advantage of the arbitrariness of $\delta y$ and reasoning as in Sec.
12.3, we get

$$P y'' + f(x) = 0$$

which is equation (10).

Thus the presence of $y'$ in the expression of the functional relates
the adjacent values of $y$ (cf. the example involving springs, which we
discussed in Sec. 12.1). In this case it is impossible to select a value of $y$
independently of adjacent values, as was done for the functional (21),
and instead of a finite equation, as in Sec. 12.3, we get a differential
equation.

Now let us derive an analogous differential equation for the
functional of a more general type:

$$I = \int_a^b F(x, y, y')\,dx \quad \left(y = y(x),\; y' = \frac{dy}{dx}\right) \tag{25}$$

where $F$ is a given function, and $a$ and $b$ are given limits of integration.
Suppose we are seeking the extremum of this functional for certain
given boundary conditions:

$$y(a) = y_a, \quad y(b) = y_b \tag{26}$$

Reasoning as we did in the foregoing problem and utilizing the
general formula (2) of Ch. 4, in which the roles of $x$, $dx$, $y$, $dy$ are played,
respectively, by $y$, $\delta y$, $y'$, $\delta y'$, we get the variation of the functional (25)

$$\delta I = \int_a^b \left[F'_y(x, y, y')\,\delta y + F'_{y'}(x, y, y')\,\delta y'\right] dx$$

(This same pattern is used for setting up the variation of functionals
of a different form.) Now, if a certain function $y(x)$ that satisfies the
conditions (26) realizes the extremum of the functional (25) compared
with all close-lying functions that satisfy those same conditions, then
by the criterion (20) it must be true that

$$\int_a^b \left[F'_y(x, y, y')\,\delta y + F'_{y'}(x, y, y')\,\delta y'\right] dx = 0 \tag{27}$$

This equation should hold true for any variation $\delta y$ that satisfies
the relations

$$\delta y(a) = 0, \quad \delta y(b) = 0 \tag{28}$$

which are needed so that $y + \delta y$ can satisfy the same conditions (26)
as $y$.

Can we satisfy equation (27) by putting $F'_y = 0$, $F'_{y'} = 0$? We would
thus obtain two equations for a single desired function $y(x)$ (recall that
the expression $F(x, y, y')$ is given). In the general case, two equations
with one desired function have no solution in common. The solution
to our problem does exist, however, for the simple reason that
$\delta y$ and $\delta y'$ are not independent quantities. It is possible to find
$y(x)$ or, to put it differently and better, an equation for $y(x)$ such that
for any $\delta y$ and the $\delta y'$ corresponding to it, one term, i.e.
$\int_a^b F'_y\,\delta y\,dx$, is exactly compensated for by another term,
$\int_a^b F'_{y'}\,\delta y'\,dx$, although they are not equal to zero when taken
separately. This is how it is done.

Integrating the second term in (27) by parts and applying the
relations (28), we get

$$0 = \int_a^b F'_y\,\delta y\,dx + F'_{y'}\,\delta y\,\Big|_a^b - \int_a^b \frac{d}{dx}\left(F'_{y'}\right) \delta y\,dx = \int_a^b \left[F'_y - \frac{d}{dx}\,F'_{y'}\right] \delta y\,dx$$

From this, by virtue of the arbitrary nature in the choice of $\delta y$, it
follows that the last square bracket is identically zero. True enough,
if we take $\delta y$ equal to this bracket and only near the points $a$ and $b$
make it quickly diminish to zero (this is necessary so as to satisfy
the relations (28)), then the last integral will be roughly equal to
$\int_a^b \left[F'_y - \frac{d}{dx}\,F'_{y'}\right]^2 dx$. But it must be equal to 0,
whence follows the assertion.

We have thus arrived at the so-called Euler equation:

$$F'_y(x, y, y') - \frac{d}{dx}\,F'_{y'}(x, y, y') = 0 \tag{29}$$

Note that in the differentiation $\frac{d}{dx}$ here the quantity $y$ is regarded
as a function of $x$; in detail, by the rule for differentiating a composite
function, the Euler equation can be rewritten thus:

$$F'_y(x, y, y') - F''_{xy'}(x, y, y') - F''_{yy'}(x, y, y')\,y' - F''_{y'y'}(x, y, y')\,y'' = 0 \tag{30}$$

Here, the partial derivatives of $F(x, y, y')$ are taken disregarding
the dependence of $y$ and $y'$ on $x$. It is clear that this is a differential
equation of the second order and, for this reason, its general solution
contains two arbitrary constants that are defined with the help of the
boundary  conditions  (26).  The  Euler  equation  generalizes  the  equations 
of  static  equilibrium  (10)  and  (12)  examined  in  Sec.  12.1. 
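
The left-hand side of the Euler equation (29) can be evaluated numerically
for any trial curve, which gives a quick way to test whether a given
function is a candidate for an extremum. The sketch below approximates all
derivatives by central differences; the routine `euler_residual`, the
step size and the sample integrand (that of (7) with an assumed constant
load $f_0$) are illustrative assumptions.

```python
import math


def euler_residual(F, y, x, h=1e-3):
    """Approximate F_y - d/dx F_{y'} at the point x along the curve y.

    F(x, y, p) is the integrand with p standing for y'.
    """
    def slope(t):
        return (y(t + h) - y(t - h)) / (2 * h)

    def F_y(t):  # partial derivative with respect to y, holding x and y' fixed
        return (F(t, y(t) + h, slope(t)) - F(t, y(t) - h, slope(t))) / (2 * h)

    def F_p(t):  # partial derivative with respect to y', holding x and y fixed
        return (F(t, y(t), slope(t) + h) - F(t, y(t), slope(t) - h)) / (2 * h)

    # Total derivative d/dx of F_{y'} taken along the curve.
    return F_y(x) - (F_p(x + h) - F_p(x - h)) / (2 * h)


P, f0, l = 2.0, 3.0, 1.0  # hypothetical tension, load density, length


def string_integrand(x, y, p):
    """Integrand of (7) with f(x) = f0."""
    return 0.5 * P * p * p - f0 * y


def parabola(x):
    """Solution of P y'' + f0 = 0 with y(0) = y(l) = 0."""
    return f0 * x * (l - x) / (2 * P)


def sine_arc(x):
    """A trial curve that satisfies the boundary conditions but not (10)."""
    return math.sin(math.pi * x / l)


residual_solution = euler_residual(string_integrand, parabola, 0.3)
residual_trial = euler_residual(string_integrand, sine_arc, 0.5)
```

For this integrand the residual equals $-f_0 - P y''$, so it vanishes
(up to rounding) on the parabola and is clearly nonzero on the sine arc.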

Now  let  us  examine  another  derivation  of  the  Euler  equation 
(29)  that  does  not  make  use  of  the  formal  procedure  of  integration  by 
parts. To do this, we partition the interval $a \le x \le b$ by means of points $x_0 = a < x_1 < x_2 < \dots < x_n = b$ into subintervals of equal length h and write down approximately the integrals of each of the terms in the left-hand member of (27) in the form of sums, the first sum taken over integral points of division and the second sum over half-integral points of division (Secs. 2.1, 2.2). If we replace the derivative by the difference quotient, then instead of (27) we get the condition

$$\sum_k (F'_y)_k\,\delta y_k + \sum_k (F'_{y'})_{k+1/2}\,\frac{\delta y_{k+1} - \delta y_k}{h} = 0 \tag{31}$$

At first glance, it might appear that the coefficient of $\delta y_k$ in the second sum is appreciably greater than the corresponding coefficient in the first sum, since for small h, $|(F'_{y'})_{k+1/2}|/h \gg |(F'_y)_k|$. But we have to take into account that in the second sum $\delta y_k$ participates in two adjacent terms: with $(F'_{y'})_{k+1/2}$ and with $(F'_{y'})_{(k-1)+1/2}$. For this reason, the total coefficient of $\delta y_k$ in the left-hand member of (31) is

$$(F'_y)_k - \frac{1}{h}\left[(F'_{y'})_{k+1/2} - (F'_{y'})_{k-1/2}\right] \approx (F'_y)_k - \left[\frac{d}{dx}(F'_{y'})\right]_k \tag{32}$$

Thus, h cancels out and both terms in the right member of (32) are of the same order.

Because $\delta y_k$ is arbitrary, from condition (31) we find that this coefficient must be zero, i.e., we arrive at (29).
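The discrete condition (32) can be tried out numerically. The sketch below is only an illustration under assumptions that are not in the text: the integrand $F = y'^2/2 - y$ on $0 \le x \le 1$ with $y(0) = y(1) = 0$ is chosen for convenience, and the function name is hypothetical. For this F the Euler equation (29) gives $y'' = -1$, that is, $y = x(1 - x)/2$, and setting the coefficient (32) to zero gives a tridiagonal linear system for the values $y_k$.

```python
# Discrete Euler equation (32) for the illustrative integrand
#   F(x, y, y') = y'**2 / 2 - y   on 0 <= x <= 1,  y(0) = y(1) = 0.
# Here F'_y = -1 and F'_{y'} = y', so the coefficient (32) set to zero reads
#   -1 - (y[k+1] - 2*y[k] + y[k-1]) / h**2 = 0,   k = 1, ..., n-1.

def solve_discrete_euler(n):
    """Return [y_0, ..., y_n] solving the tridiagonal system above (n >= 2)."""
    h = 1.0 / n
    m = n - 1                       # number of interior unknowns
    # Rewritten as  -y[k-1] + 2*y[k] - y[k+1] = h**2
    diag = [2.0] * m
    rhs = [h * h] * m
    # Forward elimination (Thomas algorithm); both off-diagonals equal -1.
    for k in range(1, m):
        w = -1.0 / diag[k - 1]
        diag[k] += w                # same as diag[k] - w * (-1)
        rhs[k] -= w * rhs[k - 1]
    # Back substitution.
    y = [0.0] * m
    y[-1] = rhs[-1] / diag[-1]
    for k in range(m - 2, -1, -1):
        y[k] = (rhs[k] + y[k + 1]) / diag[k]
    return [0.0] + y + [0.0]


n = 20
y = solve_discrete_euler(n)
# Solution of the continuous Euler equation y'' = -1 at the same points.
exact = [0.5 * (k / n) * (1.0 - k / n) for k in range(n + 1)]
max_err = max(abs(u - v) for u, v in zip(y, exact))
print("largest difference from x(1 - x)/2:", max_err)
```

For this particular F the second difference is exact on quadratics, so the two answers agree up to rounding; for a general integrand one would expect agreement only as h tends to zero.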

To  illustrate,  we  take  the  problem  of  Sec.  12.2  on  the  curve  of 
fastest  descent.  In  this  case  (see  (14))  it  is  necessary  to  put 


$$F = \frac{\sqrt{1 + y'^2}}{\sqrt{2g(y_a - y)}} \tag{33}$$


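Since x does not appear directly in the right member, it follows that the second term is absent on the left side of (30), and after multiplying both sides of (30) by y' it can be rewritten as $[F - F'_{y'}\,y']' = 0$ (indeed, when F does not contain x explicitly, $\frac{d}{dx}[F - F'_{y'}\,y'] = y'\left(F'_y - \frac{d}{dx}F'_{y'}\right)$). Then, integrating, we get

$$F - F'_{y'}\,y' = C_1 \tag{34}$$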

which  for  the  concrete  example  of  (33)  yields 

$$\frac{\sqrt{1 + y'^2}}{\sqrt{2g(y_a - y)}} - \frac{y'}{\sqrt{2g(y_a - y)}\,\sqrt{1 + y'^2}}\,y' = C_1$$

or

$$\frac{1}{\sqrt{1 + y'^2}\,\sqrt{2g(y_a - y)}} = C_1$$

In  order  to  integrate  this  equation  we  introduce  an  artificial  parameter  t 
by  the  formula 

$$y_a - y = r(1 - \cos t), \quad \text{where } r = \frac{1}{4gC_1^2}$$




Fig.  166 


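After the appropriate manipulations we get

$$\frac{1}{\sqrt{1 + y'^2}} = C_1\sqrt{2g(y_a - y)} = \sqrt{\frac{1 - \cos t}{2}};$$

$$\frac{dy}{dx} = \pm\sqrt{\frac{2}{1 - \cos t} - 1} = \pm\sqrt{\frac{1 + \cos t}{1 - \cos t}} = \pm\sqrt{\frac{\sin^2 t}{(1 - \cos t)^2}} = \pm\frac{\sin t}{1 - \cos t};$$

$$dx = \pm\frac{1 - \cos t}{\sin t}\,dy = \pm r(1 - \cos t)\,dt;$$

$$x = \pm r(t - \sin t) + C_2$$

(here $dy = -r\sin t\,dt$; the double sign absorbs the change of sign).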

This  expression  together  with  the  preceding  expression  for  y give  the 
parametric  equations  for  what  is  called  a cycloid  (see,  for  example, 
HM,  Sec.  1.8),  which  is  a curve  described  by  a point  of  a circle  of  radius  r 
rolling  down  without  sliding  along  a horizontal  straight  line  drawn 
through  A . Thus,  the  curve  of  quickest  descent  is  a cycloid  with  cusp 
at  the  point  A;  here,  the  radius  r must  be  chosen  so  that  the  first  arch 
of  the  cycloid  passes  through  the  terminal  point.  Fig.  166  shows  several 
such  arches,  of  which  the  fourth  passes  through  the  terminal  point. 
It is interesting to note that this arch partially passes below the terminal point, which is something that might not have been foreseen. Incidentally, a glance at the answer makes this clear at once: in order to cover a large horizontal path it is best to fall lower in order to gain velocity and then, at the end of the path, rise to the point of destination. The point B must of course be lower than A, otherwise there will be no solution at all.
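The choice of the radius r and the gain in time can be illustrated by a short computation. The sketch below makes assumptions beyond the text: A is placed at the origin, the terminal point lies X to the right and H below A, g = 9.81 m/s² is used, and all function names are hypothetical. It uses the parametric equations $x = r(t - \sin t)$, depth $= r(1 - \cos t)$; along such a cycloid the element of time is $\sqrt{r/g}\,dt$, so the time of descent is $t_{end}\sqrt{r/g}$. The straight chute is included only for comparison.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2 (illustrative value)


def cycloid_through(X, H):
    """Return (r, t_end) such that the first arch of the cycloid
    x = r*(t - sin t), depth = r*(1 - cos t) passes through the point
    lying X to the right of A and H > 0 below it."""
    target = X / H
    lo, hi = 1e-6, 2.0 * math.pi - 1e-6
    # (t - sin t) / (1 - cos t) grows monotonically on (0, 2*pi): bisection.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (mid - math.sin(mid)) / (1.0 - math.cos(mid)) < target:
            lo = mid
        else:
            hi = mid
    t_end = 0.5 * (lo + hi)
    r = H / (1.0 - math.cos(t_end))
    return r, t_end


def descent_time_cycloid(X, H):
    """Time of descent along the cycloid: t_end * sqrt(r / g)."""
    r, t_end = cycloid_through(X, H)
    return t_end * math.sqrt(r / G)


def descent_time_line(X, H):
    """Time of descent along a straight chute from A to the same point."""
    return math.sqrt(2.0 * (X * X + H * H) / (G * H))


for X, H in [(math.pi, 2.0), (10.0, 1.0)]:
    r, t_end = cycloid_through(X, H)
    print(f"X={X:.3f} H={H:.3f}: r={r:.4f}, t_end={t_end:.4f}, "
          f"cycloid {descent_time_cycloid(X, H):.4f} s, "
          f"line {descent_time_line(X, H):.4f} s, "
          f"dips below endpoint: {t_end > math.pi}")
```

When the parameter at the endpoint exceeds π, the lowest point of the arch (depth 2r) lies below the terminal point, which is the effect noted above for a long horizontal path.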

Let  us  return  to  the  general  functional  (25)  for  the  boundary 
conditions  (26).  Suppose,  having  fixed  these  conditions,  we  obtain 
a certain  solution  y(x)  to  the  extremum  problem.  Now  suppose  that 
these  conditions  can  change.  Then  the  extremal  value  of  the  functional  I 
will  depend  on  these  conditions: 

$$I = I(a, b, y_a, y_b) \tag{35}$$

Here we substitute into I only those functions that make I extremal for the fixed boundary conditions. If this were not done, then I would depend not only on the boundary conditions but also on an arbitrary function y(x).

It may become necessary to study the functional relationship (35) of the value of I under a change of the boundary conditions $y(a) = y_a$, $y(b) = y_b$. Since the function (35) already depends on a finite number of independent variables, this problem is solved with the tools of ordinary analysis, by calculating derivatives. As an example, here is how the derivative $\frac{\partial I}{\partial y_a}$ is calculated. If a, b, $y_b$ are fixed and $y_a$ varies, then the solution y(x) of the extremum problem will change and will receive the increment $\delta y(x)$. Then, to within higher order infinitesimals,

$$\Delta I = \int_a^b \left(F'_y\,\delta y + F'_{y'}\,\delta y'\right) dx = F'_{y'}\,\delta y\,\Big|_a^b + \int_a^b \left[F'_y - \frac{d}{dx}F'_{y'}\right]\delta y\,dx$$

But since y(x) was a solution of the extremum problem, it satisfies the Euler equation and so the last integral vanishes. Besides, $\delta y(b) = 0$ (since $y_b$ remains unchanged) and we get

$$\Delta I = -(F'_{y'})_a\,\delta y(a), \qquad \frac{\partial I}{\partial y_a} = -(F'_{y'})_a$$

The  other  derivatives  of  the  function  (35)  are  calculated  in  similar 
fashion. 
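The formula for $\partial I/\partial y_a$ can be checked on a case where everything is explicit. The following sketch rests on assumptions that are not in the text: the integrand $F = y'^2/2 + y$ on $0 \le x \le 1$ is an illustrative choice, and the function names are hypothetical. Euler's equation then gives $y'' = 1$, so the extremal is $y = x^2/2 + c_1 x + y_a$ with $c_1 = y_b - y_a - 1/2$, and $F'_{y'} = y'$. The extremal value of I is computed by the midpoint rule and differentiated numerically with respect to $y_a$.

```python
def extremal(ya, yb):
    """Extremal of I = integral_0^1 (y'^2/2 + y) dx with y(0)=ya, y(1)=yb.
    Euler's equation gives y'' = 1, hence y = x^2/2 + c1*x + ya."""
    c1 = yb - ya - 0.5
    y = lambda x: 0.5 * x * x + c1 * x + ya
    dy = lambda x: x + c1
    return y, dy


def extremal_value(ya, yb, n=2000):
    """Value of I on the extremal, by the midpoint rule with n subintervals."""
    y, dy = extremal(ya, yb)
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += (0.5 * dy(x) ** 2 + y(x)) * h
    return total


ya, yb, eps = 1.0, 0.3, 1e-4
# Central difference for dI/dy_a with a, b, y_b held fixed.
numeric = (extremal_value(ya + eps, yb) - extremal_value(ya - eps, yb)) / (2 * eps)
_, dy = extremal(ya, yb)
formula = -dy(0.0)          # -(F'_{y'}) at x = a, since F'_{y'} = y' here
print(numeric, formula)
```

The two printed numbers should agree to within the accuracy of the numerical differentiation.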


Exercises 

1. Solve the problems: (a) $I = \min \int_0^1 (y^2 + y'^2)\,dx$, $y(0) = 0$, $y(1) = 1$; (b) $I = \min \int_0^1 y\,y'^2\,dx$, $y(0) = p > 0$, $y(1) = q > 0$.

2. In (b) find $\frac{\partial I}{\partial p}$ directly and via the formula derived in the text.

3.  Using  the  calculus  of  variations,  derive  the  equation  of  a straight 
line  as  the  shortest  distance  between  two  specified  points. 

12.5  Does  a solution  always  exist? 

Condition  (20)  is  only  a necessary  condition  for  an  extremum. 
The  sufficient  conditions  are  rather  involved,  but  in  most  practical 
problems  they  are  not  required.  For  example,  in  the  problem  examined 
in Sec. 12.4 we did not verify that for the constructed arch of the cycloid it is precisely a minimal time of descent that is accomplished. But from physical reasoning it is clear that some such solution to the problem of the minimum for $y_a > y_b$ exists and only that one has to be found. And since using the necessary condition (Euler's equation) yielded only one solution, that is the desired solution. Of course, if we
did  not  give  thought  to  the  problem,  we  could  pose  the  question  of 
seeking  the  curve  of  maximum  time  of  descent  and  we  could  arrive 
at  the  same  Euler  equation  and  the  same  solution.  But  that  answer 
would  be  erroneous,  for  we  now  see  that  the  constructed  solution  (which 
is  the  only  solution)  produces  precisely  the  minimum  time  of  descent. 
As  to  the  corresponding  maximum  problem,  it  has  no  solution  at  all 
and  it  is  easy  to  construct  curves  with  arbitrarily  large  times  of 
descent. 

Thus,  an  extremum  problem  may  not  have  a solution;  in  certain 
cases  this  is  more  or  less  clear  at  once  from  the  very  statement  of  the 
problem  (as  in  the  foregoing  example  when  considering  the  maximum 
problem), whereas in other cases it follows from the results of computations. An example will serve to illustrate this point.

Let  it  be  required  to  find  the  shape  of  a film  stretched  over  two 
equal  rings  located  perpendicular  to  the  lines  joining  their  centres. 
This  is  the  shape  of  a soap  film  stretched  over  the  rings  (Fig.  167, 
where  the  heavy  lines  represent  the  axial  cross-section  of  the  film). 
Since it is clear from considerations of symmetry that the desired surface is a surface of revolution and the area of the surface of revolution, as we know (see HM, Sec. 4.7), is

$$S = 2\pi \int_{-a}^{a} y\sqrt{1 + y'^2}\,dx \tag{36}$$

the matter reduces to finding the function y(x) that makes the integral (36) a minimum under the boundary conditions


$$y(-a) = r, \quad y(a) = r \tag{37}$$


Here too the x does not enter directly under the integral sign so that we can take advantage of the intermediate integral (34), which gives us

$$2\pi y\sqrt{1 + y'^2} - \frac{2\pi y\,y'}{\sqrt{1 + y'^2}}\,y' = C_1, \quad \text{i.e.} \quad \frac{y}{\sqrt{1 + y'^2}} = \frac{C_1}{2\pi}$$

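Denoting the right-hand member by $\frac{1}{k}$ and carrying out certain manipulations, we get

$$\frac{y^2}{1 + y'^2} = \frac{1}{k^2}; \quad y' = \frac{dy}{dx} = \pm\sqrt{k^2y^2 - 1}; \quad \frac{dy}{\sqrt{k^2y^2 - 1}} = \pm dx; \quad \int \frac{dy}{\sqrt{k^2y^2 - 1}} = \pm(x + C) \tag{38}$$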


The last integral for k = 1 is given in HM, p. 476 (No. 31) and is equal to $\ln\left(y + \sqrt{y^2 - 1}\right)$. Let us try to compute the derivative of the function $\ln\left(ky + \sqrt{(ky)^2 - 1}\right)$; it is equal to $k/\sqrt{k^2y^2 - 1}$ (check this!). And so from (38) we get $\frac{1}{k}\ln\left(ky + \sqrt{k^2y^2 - 1}\right) = \pm(x + C)$, whence, after some simple manipulations which we leave to the reader, $y = \dfrac{e^{k(x+C)} + e^{-k(x+C)}}{2k}$ (the sign ± is not essential here).
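For the reader who wants to compare notes, one possible route through these manipulations is sketched below. It uses only the logarithmic relation obtained from (38) and the identity $\left(ky + \sqrt{k^2y^2 - 1}\right)\left(ky - \sqrt{k^2y^2 - 1}\right) = 1$; other routes are equally valid.

```latex
\begin{align*}
ky + \sqrt{k^2y^2 - 1} &= e^{\pm k(x + C)},\\
ky - \sqrt{k^2y^2 - 1} &= \frac{1}{ky + \sqrt{k^2y^2 - 1}} = e^{\mp k(x + C)},\\
2ky &= e^{k(x + C)} + e^{-k(x + C)},
\qquad
y = \frac{e^{k(x + C)} + e^{-k(x + C)}}{2k}.
\end{align*}
```

Adding the first two lines removes the square root, and the sum of the two exponentials is the same for either choice of sign, which is why the sign ± drops out.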

For  the  curve  to  be  symmetric  about  x = 0,  it  must  be  true  that 
C = 0,  and  so,  finally, 

$$y = \frac{e^{kx} + e^{-kx}}{2k}$$


This  curve  is  called  the  catenary,  which  is  the  shape  of  a chain 
suspended at the ends (see Exercise 2 in Sec. 12.8). To summarize, then: the desired surface of a soap film results from rotating a catenary. For the boundary conditions (37) to be satisfied, it is required that

$$\frac{e^{ka} + e^{-ka}}{2k} = r, \quad \text{or} \quad \frac{e^{ak} + e^{-ak}}{2} = rk \tag{39}$$


From  this  we  can  find  the  as  yet  unknown  value  of  the  parameter  k . 
Let  us  try  to  solve  this  problem  graphically.  For  given  a and  r,  depict 
the  graphs  of  variation  of  the  left  and  right  sides  of  equation  (39) 
(see  Fig.  168)  and  find  the  point  of  intersection  of  the  two  curves.  We 
are  surprised  to  find  that  instead  of  the  expected  one  solution  there  are 
two  (for  relatively  small  a,  i.e.  when  the  rings  are  close  together) 
or  none  (when  the  rings  are  far  apart).  What  shape  does  the  film  take 
after all?

Fig. 169


First let a be comparatively small, which means there are two solutions as shown by the heavy lines in Fig. 169 (the rings are indicated here by dashed lines). If we imagine the other shapes of the film shown in Fig. 169 by the light lines, then the areas of the corresponding surfaces of revolution will have the values represented graphically in Fig. 170. We see that of the two solutions, the minimum area is realized for the upper one and the maximum area for the lower one. This is why the upper solution yields a stable form of equilibrium of the film, and the lower solution gives an unstable form.

Fig. 170


Fig.  171 


Now if a is increased (for a given r), that is, if we move the rings away from each other, the extremum points α and β come together and for sufficiently large a the graph of the areas assumes the form shown in Fig. 171. Thus, in this case the film tends to reduce its area and contracts to the line of centres, separates, and assumes the form of two separate circles, each of which spans its ring. (Incidentally, the film behaves in the same way for small a if it begins to deform and has too thin a waist.) Thus, in this case there will be no single surface. The shape of the graphs in Figs. 170 and 171 is confirmed by the fact that for y = 0 we will have $S = 2\pi r^2$ and for y = r it will be true that $S = 4\pi ra$, so that for small a the last value is smaller than $S|_{y=0}$ and for large a it is greater. For large a, the shape consisting of two separate circles may be regarded as a generalized solution of our problem. This solution is in the nature of a "terminal" or "cuspidal" minimum, for which the Euler equation does not hold (cf. Sec. 12.9).

The  calculations  have  shown  that  we  arrive  at  conclusions  which 
could not have been foreseen.

It is easy to calculate what value of a/r is critical in the sense that equilibrium is possible for smaller values and is not possible for greater values. This occurs when both graphs in Fig. 168 touch each other, that is to say, when the derivatives of both functions with respect to k are equal:

$$\frac{ae^{ak} - ae^{-ak}}{2} = r$$

Combining this equation with (39) and setting $\frac{a}{r} = \lambda$ (this is the critical value), we get

$$k = \frac{\sqrt{1 + \lambda^2}}{r\lambda} = \frac{\sqrt{1 + \lambda^2}}{a}, \qquad \frac{e^{\sqrt{1 + \lambda^2}} - e^{-\sqrt{1 + \lambda^2}}}{2} = \frac{1}{\lambda}$$

And so λ is found from the last equation. A rough numerical calculation gives the approximate value λ = 0.7 (more precisely, about 0.66).
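The graphical solution of (39) and the critical ratio can both be reproduced numerically. The sketch below is an illustration with hypothetical function names. It writes $u = ak$ and $\lambda = a/r$, so that (39) becomes $\cosh u = u/\lambda$; since $\cosh u - u/\lambda$ is convex in u, it has either two roots, one on each side of its minimum at $\sinh u = 1/\lambda$, or none. For the critical case the tangency conditions $\cosh u = u/\lambda$ and $\sinh u = 1/\lambda$ combine into $u\tanh u = 1$.

```python
import math


def film_parameters(a, r):
    """Return the values k > 0 with cosh(a*k) = r*k, i.e. equation (39).
    The list has two entries (rings close together) or is empty."""
    lam = a / r
    h = lambda u: math.cosh(u) - u / lam      # u stands for a*k
    u_min = math.asinh(1.0 / lam)             # h is convex; this is its minimum
    if h(u_min) > 0.0:
        return []                             # the two graphs do not meet

    def bisect(lo, hi):
        # h(lo) and h(hi) are assumed to have opposite signs.
        lo_positive = h(lo) > 0.0
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if (h(mid) > 0.0) == lo_positive:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    hi = u_min + 1.0
    while h(hi) <= 0.0:                       # walk right until h is positive
        hi += 1.0
    return [bisect(0.0, u_min) / a, bisect(u_min, hi) / a]


def critical_ratio():
    """Critical a/r: solve u*tanh(u) = 1, then a/r = 1/sinh(u)."""
    lo, hi = 0.5, 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * math.tanh(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 1.0 / math.sinh(0.5 * (lo + hi))


print("rings close (a=0.5, r=1):", film_parameters(0.5, 1.0))
print("rings far   (a=1.0, r=1):", film_parameters(1.0, 1.0))
print("critical a/r:", critical_ratio())
```

Since the catenary has $y(0) = 1/k$, the smaller of the two values of k corresponds to the curve with the wider waist.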

It  is  interesting  to  note  that  if  the  upper  solution  in  Fig.  169  makes 
for  an  absolute  minimum  of  the  area  of  the  surface  of  revolution,  then 
the  lower  solution  maximizes  only  in  the  represented  family  of  surfaces. 
If  we  go  outside  the  limits  of  this  family,  the  lower  solution  will  be 
stationary  since  it  satisfies  the  Euler  equation,  but  by  no  means  will 
it  be  maximal;  it  will  be  in  the  nature  of  a minimax  (cf.  Sec.  4.6). 
If  we  choose  any  two  sufficiently  close  points  in  the  lower  solution 
and  regard  them  as  boundary  conditions,  then  the  arc  between  them 
is  stable,  which  means  it  realizes  a solution  of  the  minimum  problem. 
This  means  that  for  any  change  of  the  lower  solution  over  a sufficiently 
small  interval  the  area  will  increase.  The  maximum  problem  for  a 
surface  of  revolution  does  not  have  a solution.  (This  is  also  clear  from 
the  possibility  of  crimping  or  corrugating  any  surface.) 

The  idea  that  a small  section  of  a stationary  solution  is  not 
only  stationary  but  also  extremal  turns  out  to  be  useful  in  many 
problems. 


Exercise 

Consider the minimum problem of the functional $I = \int_0^a (y'^2 - y^2)\,dx$ (a > 0) for the boundary conditions y(0) = y(a) = 0. These conditions are satisfied, for instance, by the functions $y = Cx(a - x)$ and $y = C\sin\frac{\pi x}{a}$. What conclusion can be drawn from this concerning the existence of a solution to the problem?

12.6 Variants of the basic problem

There are large numbers of other types of variational problems that are discussed in courses of the calculus of variations. Some of these problems are investigated in the manner that we considered (in Sec. 12.4) the extreme-value problem of the functional (25) under the condition (26). For example, derivatives of the desired function of order higher than first can appear under the integral sign; in that case the Euler equation is of a higher order than second (namely, twice as high as the highest order of the derivatives that enter into the functional). Several unknown functions can appear under the integral sign; then they are sought via a system of differential Euler equations, the number of equations in the system being equal to the number of desired functions, for it is necessary to equate to zero the variations of the functional obtained by varying each of the functions. Such, for example, is the case of seeking a curve in space, for such a curve is determined by two functions, say, y(x) and z(x).

Now let us consider a variational problem for a function of several variables (for the sake of definiteness, we take two independent variables). Suppose we are seeking the function z(x, y) that gives a maximum to the integral

$$\iint_{(\sigma)} F(x, y, z, z'_x, z'_y)\,dx\,dy \tag{40}$$

where (σ) is a certain specified region with boundary (L) with the boundary condition

$$z\big|_{(L)} = \varphi \ \text{(given)} \tag{41}$$

Arguing in the manner of Sec. 12.4 leads to the Euler equation, which in this case is of the form

$$F'_z - \frac{\partial}{\partial x}F'_{z'_x} - \frac{\partial}{\partial y}F'_{z'_y} = 0 \tag{42}$$

Bear in mind that when computing $\frac{\partial}{\partial x}$ and $\frac{\partial}{\partial y}$, the z is regarded as a function of the variables x and y. Thus, to obtain a solution we get a second-order partial differential equation with the boundary condition (41). The solution of such equations is beyond the scope of this text, but one physically important example will be examined (another one is given in Sec. 12.12).

We  seek  the  equation  for  the  equilibrium  form  of  a membrane 
stretched  over  a rigid  frame  (contour).  We  assume  the  membrane  to  be 
homogeneous  (the  same  at  all  points),  isotropic  (the  same  in  all  directions) 
and stretched with a force T per unit length. The potential energy of the membrane that results from stretching it over the contour is due to an increase in its area as compared with the horizontal position. In integral calculus, proof is given that the area Q of the surface described by the equation z = z(x, y) is

$$Q = \iint_{(\sigma)} \sqrt{1 + (z'_x)^2 + (z'_y)^2}\,dx\,dy$$

so that the increase ΔQ in the area of the membrane is equal to

$$\Delta Q = \iint_{(\sigma)} \sqrt{1 + (z'_x)^2 + (z'_y)^2}\,dx\,dy - \sigma = \iint_{(\sigma)} \left[\sqrt{1 + (z'_x)^2 + (z'_y)^2} - 1\right] dx\,dy$$

Regarding the deflection as small (this requires that the given contour of the membrane deflect only slightly from the plane z = 0) and the quantities $z'_x$ and $z'_y$ as small too, we expand the integrand in a series and discard the higher order terms:

$$\Delta Q = \iint_{(\sigma)} \left\{1 + \frac{1}{2}\left[(z'_x)^2 + (z'_y)^2\right] + \text{higher order terms} - 1\right\} dx\,dy \approx \frac{1}{2}\iint_{(\sigma)} \left[(z'_x)^2 + (z'_y)^2\right] dx\,dy$$

Assuming  that  the  tension  of  the  membrane  remains  unchanged  in  the 
process  of  stretching,  we  find  the  work  done  in  this  process  to  be 

$$\Delta A = \frac{T}{2}\iint_{(\sigma)} \left[(z'_x)^2 + (z'_y)^2\right] dx\,dy \tag{43}$$

Hence  that  will  also  be  the  accumulated  potential  energy. 

However, from physics it is known that, from among all possible forms, a membrane chooses that form of equilibrium for which the potential energy is a minimum. Hence we have the problem of minimizing the integral (43). By virtue of the general Euler equation (42) we get

$$0 - \frac{\partial}{\partial x}\left(\frac{T}{2}\,2z'_x\right) - \frac{\partial}{\partial y}\left(\frac{T}{2}\,2z'_y\right) = 0$$

which, after cancelling, yields

$$z''_{xx} + z''_{yy} = 0 \tag{44}$$

Thus, the form of equilibrium of the membrane is described by the function z = z(x, y) that satisfies the Laplace equation (see Sec. 5.7). To find this form in a specific instance, it is necessary to find the solution to equation (44) for the boundary condition (41).
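Although the solution of such equations is outside the scope of the text, a rough numerical picture is easy to obtain. The sketch below is not the authors' method: it assumes a square frame $0 \le x, y \le 1$, a uniform grid, and hypothetical function names. If the integral (43) is replaced by a sum of squared differences between neighbouring grid values, the sum is smallest when every interior value equals the mean of its four neighbours, which is the usual five-point form of equation (44). Repeated averaging (Gauss-Seidel relaxation) then approaches the equilibrium shape.

```python
def relax_membrane(frame_height, n, sweeps=2000):
    """Approximate the solution of z_xx + z_yy = 0 on the unit square.

    frame_height(x, y) gives z on the boundary (condition (41));
    the grid has (n + 1) x (n + 1) points; returns z as a list of rows."""
    h = 1.0 / n
    z = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            if i in (0, n) or j in (0, n):
                z[i][j] = frame_height(i * h, j * h)
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                # Each interior value becomes the mean of its four neighbours.
                z[i][j] = 0.25 * (z[i - 1][j] + z[i + 1][j]
                                  + z[i][j - 1] + z[i][j + 1])
    return z


# Illustrative frame: a slightly warped contour, z = 0.1 * (x^2 - y^2) on the edge.
frame = lambda x, y: 0.1 * (x * x - y * y)
n = 10
z = relax_membrane(frame, n)
print("centre deflection:", z[n // 2][n // 2])
print("z at (0.3, 0.7):", z[3][7], "compare", frame(0.3, 0.7))
```

The frame in the example was chosen so that $0.1(x^2 - y^2)$ itself satisfies (44) inside the square, which gives a direct check of the relaxed values.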
Exercises

1. Derive the Euler equation for the functional $I = \int_a^b F(x, y, y', y'')\,dx$ under the boundary conditions $y(a) = y_a$, $y'(a) = y'_a$, $y(b) = y_b$, $y'(b) = y'_b$.

2. Write out equation (42) in full, like (30).

12.7  Conditional  extremum  for  a finite  number  of  degrees  of  freedom 

Let  us  go  back  to  the  extremum  problems  for  a system  with  a 
finite  number  of  degrees  of  freedom.  In  the  problems  considered  in  Sec. 
4.6  the  independent  variables  were  not  connected  by  any  relations; 
such  extrema  are  called  absolute.  There  are  also  problems  involving 
conditional  extrema  in  which  the  independent  variables  are  related 
by  definite  equations.  Let  us  begin  with  functions  of  two  variables. 

Suppose  we  are  seeking  the  maximum  or  minimum  of  the  function 
z = f(x, y) under the condition that the variables x and y are not independent but are connected by the relation

$$F(x, y) = h \tag{45}$$

This means that the values of the function f are considered and compared only for points (in the plane of the arguments) lying on the line given by the equation (45). For example, Fig. 172 depicts level lines of a certain function f(x, y) having an absolute maximum at the point K; the heavy line (L) given by the equation (45) is also shown. Going along (L), we come to the level line with the highest label at point A and then pass immediately into a region with lower labels, a region of lower altitudes. This means that there is a conditional maximum of the function f at the point A, but it is not an absolute maximum because if we move from (L) towards K, we could find higher values of f near A. In similar fashion, we can check that at the point B we have a conditional minimum, and at the point C another conditional maximum; in other words, there are three conditional extrema here. Thus, an absolute maximum is like a mountain peak, whereas a conditional maximum is the highest point of a given mountain path (the projection of this path on the xy-plane has the equation (45)).

If it is possible to express y in terms of x from the constraint equation (45), then this result can be substituted into the expression for z,

$$z = f[x, y(x)] \tag{46}$$

to obtain z as a function of one independent variable. Since there is no longer any condition (it has been taken into account by the substitution y = y(x)), it follows that the problem of seeking the extremum of z becomes an absolute-extremum problem. A similar result is obtained if equation (45) can be solved for x or if the equation of the line (45) can be represented in parametric form.

Fig. 172

But such a solution of (45) is not always possible and advisable. Then we can reason as follows. The constraint equation (45) defines fundamentally a certain relationship y = y(x), although it is implicit and not known. Thus, z is a composite function (46) of the independent variable x, and the necessary condition for an extremum yields, by the formula of the derivative of a composite function,

$$\frac{dz}{dx} = f'_x + f'_y\,\frac{dy}{dx} = 0 \tag{47}$$


Here, dy/dx signifies the derivative of the implicit function y(x) defined from the condition (45). By the rules of Sec. 4.3, we find that $F'_x + F'_y\,\frac{dy}{dx} = 0$, or $\frac{dy}{dx} = -F'_x/F'_y$. Substituting this expression into (47), we get, at the point of the conditional extremum,

$$f'_x - \frac{F'_x}{F'_y}\,f'_y = 0, \quad \text{that is,} \quad -\frac{F'_x}{F'_y} = -\frac{f'_x}{f'_y}, \quad \text{or} \quad \frac{f'_x}{F'_x} = \frac{f'_y}{F'_y}$$

(The middle equation signifies, by virtue of Sec. 4.3, that at the point of the conditional extremum the curve (45) touches the level line of the function f, cf. Fig. 172.) Denote the last ratio at the point under consideration by λ. Then at the point of the conditional extremum we have

$$\frac{f'_x}{F'_x} = \frac{f'_y}{F'_y} = \lambda \tag{48}$$

or

$$f'_x - \lambda F'_x = 0, \quad f'_y - \lambda F'_y = 0 \tag{49}$$

Introduce the function

$$f^*(x, y; \lambda) = f(x, y) - \lambda F(x, y) \tag{50}$$

where λ is an unknown parameter called a Lagrange multiplier. Then (49) can be written thus:

$$f^{*\prime}_x = 0, \quad f^{*\prime}_y = 0 \tag{51}$$

Thus, we have the same equations as in the case of the absolute extremum (see (28) of Ch. 4); however, they are set up not for the function f itself, but rather for the changed function f* defined by formula (50). Equations (51) together with the constraint equation (45) form a system of three equations in three unknowns x, y, λ. These equations yield the conditional extremum points.

The Lagrange multiplier λ has a simple meaning. To determine it, denote the coordinates of the point of conditional extremum and the extremal value itself by $\bar{x}$, $\bar{y}$, and $\bar{z}$, respectively. Up to now we considered h to be fixed, but if we vary h, then these three quantities will depend on h. Let us determine at what rate the extremal value of z will change as h is varied. Since $\bar{z}(h) = f(\bar{x}(h), \bar{y}(h))$, it follows that

$$\frac{d\bar{z}}{dh} = f'_x\,\frac{d\bar{x}}{dh} + f'_y\,\frac{d\bar{y}}{dh} \tag{52}$$

On the other hand, by (45),

$$F'_x\,\frac{d\bar{x}}{dh} + F'_y\,\frac{d\bar{y}}{dh} = 1 \tag{53}$$

From (52), (48) and (53) we get

$$\frac{d\bar{z}}{dh} = \lambda F'_x\,\frac{d\bar{x}}{dh} + \lambda F'_y\,\frac{d\bar{y}}{dh} = \lambda\left[F'_x\,\frac{d\bar{x}}{dh} + F'_y\,\frac{d\bar{y}}{dh}\right] = \lambda$$

Thus, the multiplier λ is equal to the rate of change of the extremal value as the parameter h varies in the condition. The Lagrange method is remarkable in that the derivative $d\bar{z}/dh$ can be found without writing out the function $\bar{z}(h)$, which may be very intricate, in explicit form.
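The meaning of the multiplier can be seen on a small example. The sketch below uses an illustrative problem that is not in the text: the minimum of $f = x^2 + 2y^2$ on the line $F = x + y = h$; the function name is hypothetical. The conditional minimum is found without the Lagrange method, by substituting y = h - x and searching along the line; then λ is read off from (48) as $f'_x/F'_x = 2x$ and compared with a numerical derivative of the extremal value with respect to h.

```python
def constrained_minimum(h):
    """Minimize f(x, y) = x^2 + 2*y^2 on the line x + y = h.

    Uses the substitution y = h - x, as in (46), and a ternary search
    on x in [-100, 100]. Returns (x, y, minimal value of f)."""
    f_on_line = lambda x: x * x + 2.0 * (h - x) ** 2
    lo, hi = -100.0, 100.0
    for _ in range(300):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f_on_line(m1) < f_on_line(m2):
            hi = m2
        else:
            lo = m1
    x = 0.5 * (lo + hi)
    return x, h - x, f_on_line(x)


h, eps = 3.0, 1e-3
x, y, z = constrained_minimum(h)
lam_from_x = 2.0 * x / 1.0      # f'_x / F'_x, relation (48)
lam_from_y = 4.0 * y / 1.0      # f'_y / F'_y, relation (48)
dz_dh = (constrained_minimum(h + eps)[2]
         - constrained_minimum(h - eps)[2]) / (2.0 * eps)
print("point:", x, y, "value:", z)
print("lambda from (48):", lam_from_x, lam_from_y)
print("numerical dz/dh: ", dz_dh)
```

Both ratios in (48) and the numerical rate of change should come out equal, up to the accuracy of the search.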

The statement of a conditional-extremum problem is typical of problems in economics: we seek the maximum quantity of goods z for specified expenditures h; we know the dependence of z and h on the type of action described by the variables x, y. For optimal action, to every h there corresponds one definite $\bar{z}$. The derivative $dh/d\bar{z}$ is the cost price of the (surplus) product in an ideally adjusted economic system where the outlay h has already been made and z has been produced, and it is now required to increase the production.

Investigation of a conditional extremum is carried out in similar fashion for functions of any number of variables and for any number of constraints. For example, if we seek the extremum of the function f(x, y, z, u, v) under the conditions

$$F_1(x, y, z, u, v) = 0, \quad F_2(x, y, z, u, v) = 0, \quad F_3(x, y, z, u, v) = 0 \tag{54}$$

then we have to proceed as if we were seeking the absolute extremum of the function $f^* = f - \lambda_1F_1 - \lambda_2F_2 - \lambda_3F_3$, where $\lambda_1$, $\lambda_2$, $\lambda_3$ are unknown Lagrange multipliers. The necessary condition of an extremum for f* yields $f^{*\prime}_x = 0$, $f^{*\prime}_y = 0$, $f^{*\prime}_z = 0$, $f^{*\prime}_u = 0$, $f^{*\prime}_v = 0$, which, together with (54), produces 5 + 3 equations in 5 + 3 unknowns x, y, z, u, v, $\lambda_1$, $\lambda_2$, $\lambda_3$.


Exercise 


Find the conditional extremum of the function $u(x, y, z) = x^2 - y^2 + z^2 - 2x$ (a) under the condition x + 2y - z = 3; (b) under the conditions x + y - z = 0, x + 2y = 1.


12.8  Conditional  extremum  in  the  calculus  of  variations 


Conditional-extremum problems in the calculus of variations are posed and solved in a manner similar to that done in Sec. 12.7 for problems with a finite number of degrees of freedom. To illustrate, let us consider the problem of finding the extremum of the functional (25) under the boundary conditions (26) and the accessory integral condition

$$\int_a^b G(x, y, y')\,dx = K \tag{55}$$

where G is a given function and K is a given number. If we partition the interval of integration into a large number of small subintervals and replace the integrals (25) and (55) by sums depending on the values $y_i$ of the desired function at the division points (that is, if we perform a transition just the reverse of that carried out in Sec. 12.1), then we arrive at a conditional-extremum problem of a function of a finite number of variables $y_i$. By virtue of the results of Sec. 12.7, this problem is equivalent to the absolute-extremum problem of the expression

$$\int_a^b F(x, y, y')\,dx - \lambda\int_a^b G(x, y, y')\,dx = \int_a^b (F - \lambda G)\,dx$$

(here, we passed from sums to integrals), where λ is a constant Lagrange multiplier not known beforehand. Thus, instead of (29) we have to write a similar Euler equation for the function $F^* = F - \lambda G$. After solving this equation we find the two constants of integration and the constant λ from the three conditions (26) and (55), and, by virtue of Sec. 12.7, the quantity λ is of great importance since it is equal to

$$\lambda = \frac{dI_{\text{extrem}}}{dK} \tag{56}$$

This equation makes it possible to determine the character of the dependence of the extremal (generally, stationary) value $I_{\text{extrem}}$ of the functional if the parameter K in (55) can vary (this value $I_{\text{extrem}}$ then of course depends on K).

By way of an example, let us consider the famous Dido problem. As the story goes, there lived a Princess Dido of Phoenicia, who, pursued by the ruler of a neighbouring country, fled to North Africa
where  she  bargained  with  a local  chieftain  for  some  land,  agreeing 
to  pay  a fixed  sum  for  as  much  land  as  could  be  encompassed  by 
a bull's  hide.  When  her  request  was  granted,  she  proceeded  to  cut 
the  hide  into  very  thin  strips,  tied  them  end  to  end,  and  then  to 
the  amazement  of  the  onlookers  she  enclosed  an  enormous  portion 
of  land.  Dido  then  founded  the  celebrated  ancient  city  of  Carthage. 

Ancient  scholars  already  posed  the  problem  of  how  Dido  should 
have  arranged  her  string  in  order  to  encompass  a maximum  area, 
in  other  words,  how  to  choose  from  all  curves  of  given  length  that 
one which embraces the largest area. It turned out that the solution of this isoperimetric problem is a circle (if the curve is closed)
or  an  arc  of  a circle  (if  the  curve  is  not  closed,  say,  if  Dido  chose 
land  along  the  sea  so  that  the  ends  of  the  string  lay  on  the  shore; 
see  Fig.  173).  However,  it  is  not  at  all  easy  to  prove  this  in  a rigorous 
manner. 

Let us see how Dido's problem can be formulated analytically. For the sake of simplicity, assume the shoreline to be straight and put the coordinate axes as in Fig. 173. We consider a variant in which the position of the endpoints of the string is fixed. Then the shape of the string is given by the equation y = y(x); the function y(x) is not known beforehand but it must be such as to satisfy the conditions

$$y(a) = 0, \quad y(b) = 0 \tag{57}$$

Besides, the given length of the curve is

$$\int_a^b \sqrt{1 + y'^2}\,dx = L \tag{58}$$

(the formula for the length of a curve is derived in integral calculus; see HM, Sec. 4.5). And it is required that the area, i.e.

$$S = \int_a^b y\,dx \tag{59}$$


be  a maximum.  We  thus  have  the  following  problem:  from  among 
all  functions  y = y(x)  that  satisfy  the  conditions  (57)  and  (58)  choose 
the  one  for  which  the  integral  (59)  has  the  largest  possible  value. 

The  problem  of  Dido  was  first  solved  by  geometrical  methods 
involving  ingenious  but  not  altogether  obvious  reasoning.  Using 
the  apparatus  of  the  calculus  of  variations,  we  can  solve  it  by  a 
standard  procedure.  By  virtue  of  what  was  said  at  the  beginning 
of  this  section,  it  is  required  merely  to  solve  the  Euler  equation 
for  the  function 

$$F^* = y - \lambda\sqrt{1 + y'^2} \tag{60}$$


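Using the intermediate integral (34), we get

$$y - \lambda\sqrt{1 + y'^2} + \lambda\,\frac{y'}{\sqrt{1 + y'^2}}\,y' = C_1$$

Transforming, we obtain

$$y - \frac{\lambda}{\sqrt{1 + y'^2}} = C_1; \quad \lambda^2 = (1 + y'^2)(C_1 - y)^2; \quad y'^2 = \frac{\lambda^2}{(C_1 - y)^2} - 1;$$

$$\frac{dy}{dx} = \pm\frac{\sqrt{\lambda^2 - (y - C_1)^2}}{y - C_1}; \quad \frac{(y - C_1)\,dy}{\pm\sqrt{\lambda^2 - (y - C_1)^2}} = dx;$$

$$\pm\sqrt{\lambda^2 - (y - C_1)^2} = x + C_2; \quad (x + C_2)^2 + (y - C_1)^2 = \lambda^2 \tag{61}$$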


We get the equation of a circle, hence the solution of Dido's problem is an arc of a circle, as we have already pointed out. Since the length L of the arc of the circle and its endpoints are given, it is easy to find the centre of the circle and to represent the arc itself.
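As a numerical illustration of the last remark, the following Python sketch recovers the circle from the endpoints and the length. It assumes a straight shoreline along the x-axis and works with the half-angle θ subtended by the arc, so that L = 2Rθ and b − a = 2R sin θ; the function name `dido_arc` and the bisection solver are illustrative choices, not something taken from the text.

```python
import math

def dido_arc(a, b, length):
    """Circular arc of given length through (a, 0) and (b, 0), lying above the x-axis.

    Returns (radius, centre, enclosed_area). theta is half the central angle,
    so length = 2*R*theta and chord = 2*R*sin(theta).
    """
    chord = b - a
    if not (length > chord > 0):
        raise ValueError("need b > a and length > b - a")
    target = chord / length          # equals sin(theta) / theta
    lo, hi = 1e-12, math.pi          # sin(t)/t decreases from 1 to 0 on (0, pi)
    for _ in range(200):             # plain bisection
        mid = 0.5 * (lo + hi)
        if math.sin(mid) / mid > target:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    radius = length / (2.0 * theta)
    centre = (0.5 * (a + b), -radius * math.cos(theta))
    # area between the arc and the chord (circular segment)
    area = radius ** 2 * (theta - math.sin(theta) * math.cos(theta))
    return radius, centre, area
```

For a = −1, b = 1, L = π the sketch returns the unit semicircle centred at the origin; its area π/2 exceeds that of the isosceles triangle with the same base and the same total length of the two sides.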


Fig. 174


Dido's  problem  has  variants.  If  the  length  L is  given,  but  the 
endpoints  a,  b of  the  arc  are  not  given,  then  of  all  arcs  of  the  given 
length  it  is  required  to  find  the  one  that  bounds  maximum  area. 
It  can  be  verified  that  the  desired  arc  of  the  circle  has  to  approach 
the  shoreline  perpendicularly ; in  particular,  if  the  shoreline  is  straight, 
then  we  have  to  take  a semicircle.  If  there  is  no  sea  and  the  line 
has  to  be  closed,  then,  as  pointed  out  above,  we  get  a circle.  For 
the last case, let us verify the relation (56). Here, $I_{\text{extrem}} = \pi R^2$, $K = L = 2\pi R$ ($R$ is the radius of the circle), whence

$$\frac{dI}{dK} = \frac{d(\pi R^2)}{d(2\pi R)} = R$$

This corresponds precisely to formula (61), from which it is evident that the radius of the circle is equal to $|\lambda|$ (below we will see that $\lambda = R$, so that the signs in (56) are also in agreement).

The  problem  of  Dido  has  a simple  physical  explanation.  Imagine 
a rigid  horizontal  frame  over  which  is  stretched  a soap  film  and 
let  a thin  string  rest  on  the  film  with  ends  attached  to  the  frame 
(Fig.  174a)  at  the  points  a and  b.  If  the  film  is  punctured  with  a 
needle  inside  the  encompassed  area,  then,  due  to  surface  tension, 
the film will stretch (see Fig. 174b) so that the area becomes the
smallest  possible,  which  means  the  film-free  area  becomes  the  largest 
possible.  This  is  precisely  the  problem  of  Dido. 

This physical picture enables one to interpret the Euler equation as an equation of static equilibrium (as in Sec. 12.1). If the length L of the string is given, then the potential energy U of the system is proportional to the surface of the film, or

$$U = 2\sigma(S_0 - S)$$

where σ is the "coefficient of surface tension" (the factor 2 is due to the fact that the film has two sides) and $S_0$ is the area embraced by the frame, so that $S_0 - S$ is the area of the film. As we have seen, the condition of static equilibrium reduces to the stationarity of the potential energy U, which amounts to the stationarity of S (the maximal nature of S signifies the minimal nature of U, which means equilibrium stability). We thus arrive at the Euler equation for the function F*.

Fig. 175


The solution that we have found can be obtained from physical considerations. Let us consider (Fig. 175) an element (dL) of string acted on by the force 2σ dL of surface tension and the forces of tension of the string P and P + dP applied to its endpoints. Projecting these forces on the tangent line to the element and discarding all terms higher than first order, we get

$$-P + P + dP = 0$$

(bear in mind that the cosine of a small angle differs from unity by a second-order quantity). From this dP = 0, that is, P = constant along the string. Projecting the forces on the normal, we get

$$2P \sin d\alpha = 2P\,d\alpha = 2\sigma\,dL \tag{62}$$

whence

$$\frac{2\,d\alpha}{dL} = \frac{2\sigma}{P} = \text{constant}$$

The ratio $\frac{2\,d\alpha}{dL}$ constitutes the curvature k of the string (Sec. 9.4). This means that the string at equilibrium has a constant curvature, i.e. it constitutes an arc of a circle of radius $R = \frac{1}{k} = \frac{P}{2\sigma}$. (Incidentally, the constancy of P may also be obtained from formula (62), which can be rewritten as P = 2σR, by proceeding from the solution (61) of the conditional-extremum problem; here R = λ, or P = 2σλ.)

We can approach the soap-film problem in a different way. We assume the tension P of the string to be given, and its length L unknown; in other words, we consider the scheme shown in Fig. 176. In this case the potential energy of the system is equal to

$$U = \text{constant} - 2\sigma S + PL = \text{constant} - 2\sigma\left(S - \frac{P}{2\sigma}L\right)$$

since the load rises by ΔL when L is increased by ΔL, that is, U increases by PΔL. We have thus arrived at a problem involving the absolute extremum of the functional $S - \frac{P}{2\sigma}L$.

Fig. 176

We now have to set up the Euler equation for the function (60) with $\lambda = \frac{P}{2\sigma}$. As we have seen, the solution of this equation is a curve given by the equation (61), that is, an arc of a circle of radius

$$R = |\lambda| = \frac{P}{2\sigma} \tag{63}$$

Thus,  the  solution  of  a conditional-extremum  problem  in  the  new 
interpretation  may  be  regarded  as  a solution  of  the  absolute-extremum 
problem ; this  transition  has  a physical  as  well  as  a formal  meaning. 

It is interesting to note that in the second interpretation the problem has two solutions: from (63) we get the value of the radius of the curved string, but there are two possible positions of the string with a given radius (Fig. 177)! If we consider the family of circles passing through the points a and b, the curve of U as a function of h is of the form shown in Fig. 178. Thus, of the two possible positions of the weight, the lower one is stable and the upper one is unstable. Here, as at the end of Sec. 12.5, the minimum is of an absolute nature, whereas the maximum is extremal only among the arcs of the circles. Actually, for the higher position the energy has a minimax.
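The two equilibrium positions can be checked numerically. The sketch below is only an illustration under stated assumptions: the endpoints are placed at (−c, 0) and (c, 0), the family of circular arcs is parametrized by the height h of the arc above the chord (an assumed reading of the parameter h of Fig. 178), and the additive constant in U is dropped. The function names are hypothetical.

```python
import math

def arc_geometry(h, c=1.0):
    """Radius, length and enclosed area of the circular arc of height h over the chord [-c, c]."""
    radius = (c * c + h * h) / (2.0 * h)
    theta = 2.0 * math.atan(h / c)            # half the central angle
    length = 2.0 * radius * theta
    area = radius ** 2 * (theta - 0.5 * math.sin(2.0 * theta))
    return radius, length, area

def energy(h, tension, sigma, c=1.0):
    """U = -2*sigma*S + P*L, with the additive constant omitted."""
    _, length, area = arc_geometry(h, c)
    return -2.0 * sigma * area + tension * length

def stationary_heights(tension, sigma, c=1.0):
    """Heights at which the arc radius equals P/(2*sigma), as required by (63)."""
    r0 = tension / (2.0 * sigma)
    if r0 < c:
        raise ValueError("no arc through the endpoints has radius P/(2*sigma)")
    root = math.sqrt(r0 * r0 - c * c)
    return r0 - root, r0 + root
```

With c = 1, σ = 0.5 and P = 1.25 the two heights are 0.5 and 2; evaluating `energy` near them shows a local minimum of U at the flatter arc and a local maximum at the more bulging one, in line with the stability statement above (the shorter string corresponds to the lower position of the weight).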

The  problem  of  Dido  can  be  generalized  by  assuming  that  it 
is  required  to  embrace  the  most  valuable  portion  of  land,  the  cost 
of  unit  area  of  land  not  being  a constant  but,  say,  dependent  on 
the  distance  from  the  sea.  This  generalized  problem  can  no  longer 
be  solved  in  an  elementary  fashion.  But  it  can  be  solved  with  the 
aid of Euler's equation by noting that here instead of (59) we have to maximize the integral $\int_a^b F(y)\,dx$, where $F(y) = \int_0^y \varphi(\eta)\,d\eta$ and $\varphi(\eta)$ is the cost of unit area at a distance η from the sea. (If $\varphi(\eta) \equiv 1$, then we come back to (59).)

Fig. 178

Let us consider another conditional-extremum problem: the problem of the distribution of charges in a conductor. Imagine an isolated conductor (Ω) of arbitrary shape carrying a quantity q of electricity. This quantity is arranged on the conductor with density ρ, which it is required to find.

To solve this problem, recall formula (23) of Ch. 10 for a potential generated by a distributed charge, which we rewrite as

$$\varphi(\mathbf{r}) = \int_{(\Omega)} \frac{\rho(\mathbf{r}_0)}{|\mathbf{r} - \mathbf{r}_0|}\,d\Omega_0$$

Therefore the total potential energy of the charge is

$$U = \frac{1}{2}\int_{(\Omega)} \varphi(\mathbf{r})\,dq = \frac{1}{2}\int_{(\Omega)} \varphi(\mathbf{r})\rho(\mathbf{r})\,d\Omega = \frac{1}{2}\int_{(\Omega)}\int_{(\Omega)} \frac{\rho(\mathbf{r}_0)\rho(\mathbf{r})}{|\mathbf{r} - \mathbf{r}_0|}\,d\Omega\,d\Omega_0$$

By the general principle of static equilibrium, this integral must assume a minimal (or, at any rate, a stationary) value provided

$$\int_{(\Omega)} \rho(\mathbf{r})\,d\Omega = q \quad (= \text{constant})$$

since charges can only be redistributed but cannot be created or made to vanish. What we have, therefore, is a conditional-extremum problem. To solve it, equate to zero the variation δ(U − λq), where λ is a Lagrange multiplier.

But by the formula for the differential of a product,

$$\delta U = \frac{1}{2}\int_{(\Omega)}\int_{(\Omega)} \frac{\delta\rho(\mathbf{r}_0)\rho(\mathbf{r})}{|\mathbf{r} - \mathbf{r}_0|}\,d\Omega\,d\Omega_0 + \frac{1}{2}\int_{(\Omega)}\int_{(\Omega)} \frac{\rho(\mathbf{r}_0)\,\delta\rho(\mathbf{r})}{|\mathbf{r} - \mathbf{r}_0|}\,d\Omega\,d\Omega_0$$

If in the first integral we denote r by r₀ and r₀ by r, then it is equal to the second integral, whence

$$0 = \delta(U - \lambda q) = \int_{(\Omega)}\int_{(\Omega)} \frac{\rho(\mathbf{r}_0)\,\delta\rho(\mathbf{r})}{|\mathbf{r} - \mathbf{r}_0|}\,d\Omega\,d\Omega_0 - \lambda\int_{(\Omega)} \delta\rho(\mathbf{r})\,d\Omega = \int_{(\Omega)} [\varphi(\mathbf{r}) - \lambda]\,\delta\rho(\mathbf{r})\,d\Omega$$

Since δρ(r) is arbitrary, we get φ(r) − λ = 0, or φ(r) = λ = constant.

Fig. 179


Summarizing, we see that a free charge in a conductor distributes itself so that the potential φ₀ is the same at all points of the conductor. It is also evident that the Lagrange multiplier in this problem is equal to φ₀ (derive this last result from the formula (56)). The constancy of the potential could also have been derived by physical reasoning: if the potential were different at distinct points, then the freely moving charges in the conductor would flow from high-potential regions to regions of lower potential, so that under these conditions static equilibrium is impossible.
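The condition φ = constant can be tried out on a discretized conductor. The following Python sketch is a rough illustration only, under assumptions that are not in the text: the conductor is a thin straight rod cut into n equal segments, each segment carries a point-like charge, the self-potential of a segment is approximated by that of a uniformly charged cylinder of small radius, and Gaussian units (φ = q/r) are used. The names `solve_linear` and `rod_charges` are hypothetical.

```python
import math

def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = aug[i][n] - sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x

def rod_charges(n=21, rod_length=1.0, wire_radius=0.002, total_charge=1.0):
    """Segment charges that make the potential equal on every segment of a thin rod."""
    h = rod_length / n
    xs = [(i + 0.5) * h for i in range(n)]
    # potential at the centre of a segment due to its own unit charge (assumed model)
    self_term = 2.0 * math.asinh(h / (2.0 * wire_radius)) / h
    kernel = [[self_term if i == j else 1.0 / abs(xs[i] - xs[j])
               for j in range(n)] for i in range(n)]
    unit = solve_linear(kernel, [1.0] * n)     # charges giving potential 1 everywhere
    scale = total_charge / sum(unit)           # enforce the constraint on the total charge
    charges = [scale * u for u in unit]
    potentials = [sum(kernel[i][j] * charges[j] for j in range(n)) for i in range(n)]
    return charges, potentials
```

In this model the common value of the potential plays the role of the Lagrange multiplier λ, and the computed charges come out larger near the ends of the rod than in the middle.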

Since the potential is connected with charges by the familiar Poisson equation (43) of Ch. 10, we can draw the conclusion from the constancy of the potential that ρ = 0 inside the conductor, which means that the entire charge is located on the surface of the conductor. Incidentally, this conclusion could have been made on the basis of physical reasoning.

When we pass through the surface of the conductor, the potential has a "corner" on the graph (Fig. 179), which means a discontinuity in the first derivative. If we choose the coordinate axes so that the origin lies at a certain point of the conductor and the z-axis is in the direction of the outer normal to the surface of the conductor, then the derivative $\frac{\partial\varphi}{\partial z}$ will have a jump $-\tan\alpha$. Therefore $\frac{\partial^2\varphi}{\partial z^2}$, and with it $\Delta\varphi$, will have a delta-like singularity of the form $-\tan\alpha\,\delta(z)$. On the other hand, if the charge is located on the surface with density σ, then near the point at hand, ρ = σδ(z). Therefore the equation (43) of Ch. 10 yields $-\tan\alpha\,\delta(z) = -4\pi\sigma\,\delta(z)$, or $\tan\alpha = 4\pi\sigma$.

It is to be noted in conclusion that solving a conditional-extremum problem always signifies the solution of a certain "companion" conditional-extremum problem. For example, in the problem of Dido we get the same answer if we seek the minimum length of a string enclosing a given area. Indeed, if a noncircular contour (L₁) enclosed the same area as the circle (L), L₁ < L, then by proportionally increasing the contour (L₁), we could obtain the contour (L₂), L₂ = L, enclosing a greater area than (L), but this runs counter to the solution of Dido's problem. Similarly, in the potential problem we could seek the maximum charge that fits on a conductor for a given value of the potential energy. In the general case, it turns out that the problem of the stationary value of A, provided B is constant, and the problem of the stationary value of B, provided A is constant, lead to the same solutions. This is quite true because they reduce to problems of the stationarity of A + λB or B + μA = μ(A + λ₁B) (λ₁ = 1/μ). That is to say, since μ is constant, of A + λ₁B. Hence the two problems are equivalent.

Exercises 

1.  Find  a solid  of  revolution  of  the  largest  volume  having  a given 
surface  area. 

2.  Find  the  shape  of  equilibrium  for  a homogeneous  heavy  string 
suspended  at  the  endpoints. 

Hint.  The  string  must  lie  so  that  its  centre  of  gravity  is  as  low 
as  possible. 

3.  In  the  foregoing  problems,  indicate  the  meaning  of  the  Lagrange 
multiplier. 

12.9  Extremum  problems  with  restrictions 

We  deal  here  with  another  class  of  extremum  problems  that 
have  come  to  the  fore  in  recent  years.  These  are  problems  in  which 
the  desired  quantities  or  the  desired  function  are  restricted  by  ac- 
cessory conditions  in  the  form  of  inequalities. 

We start with the simplest case. Suppose we are seeking the extremum of the function y = f(x), the independent variable x being restricted to the range given by the inequalities $a \le x \le b$ (Fig. 180). We then have the possibility of inner extrema (a maximum at x = c in Fig. 180 and a minimum at x = d) and also of endpoint extrema (the endpoint minimum at x = a and the endpoint maximum at x = b). For finding inner extrema, use can be made of the stationarity condition y' = 0, whereas for the endpoint extrema this condition does not hold, which is to say that the latter are in the nature of "cuspidal" extrema (cf. HM, Sec. 4.2).

Suppose we are now looking for the extremum of a function of two variables z = f(x, y) and the independent variables are connected by the restriction $F(x, y) \ge 0$. We assume that this inequality defines, in the xy-plane of the arguments, a certain finite region (S) with boundary (L) (Fig. 181) on which F = 0. The function f can have both inner extrema within (S) and boundary extrema on (L). To find the former, we can use the conditions of stationarity discussed in Sec. 4.6. However, as in the one-dimensional case, these conditions fail for the boundary extrema. To find the boundary extrema, note that if the function f has an extremum (say, a minimum) at some point M of the boundary (Fig. 181), then the value of f(M) is less than all values of f on (L) near M. Therefore, at the same time, f has a minimum at M provided F = 0, and such points can be sought out by the method of Sec. 12.7. And so the boundary extrema are found by the rule for finding conditional extrema.
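A small Python sketch may make the two kinds of candidates concrete. The example is hypothetical and not taken from the text: the function f(x, y) = x² + y² − x on the disc x² + y² ≤ 1, that is, F(x, y) = 1 − x² − y² ≥ 0. The inner stationary point is found from the stationarity conditions by hand, while the boundary F = 0 is simply scanned along a parametrization, a crude numerical stand-in for the conditional-extremum rule.

```python
import math

def f(x, y):
    return x * x + y * y - x

def extrema_on_disc(samples=3600):
    """Smallest and largest values of f on the closed unit disc, each with its location."""
    # inner candidate: grad f = (2x - 1, 2y) = 0 gives the point (0.5, 0)
    candidates = [(f(0.5, 0.0), (0.5, 0.0))]
    # boundary candidates: points of the circle x^2 + y^2 = 1
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        x, y = math.cos(t), math.sin(t)
        candidates.append((f(x, y), (x, y)))
    return min(candidates), max(candidates)
```

Here the least value, −1/4, is an inner extremum at (0.5, 0), whereas the greatest value, 2, is a boundary extremum at (−1, 0), where the stationarity conditions of Sec. 4.6 do not hold.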

Extremum problems with restrictions are also encountered in the calculus of variations. By way of an illustration we consider the problem of the curve of quickest descent that was solved in Sec. 12.4, but let us impose the restriction that the particle in the descent is not allowed to fall below the terminal point, that is, the desired solution is restricted by the inequality $y \ge y_b$. We draw a cycloid with cusp at the initial point A (Fig. 182), and the sought-for curve must lie entirely in the hatched area. It is clear that if the terminal point lies to the left of B, say, at B₁, then the desired curve is the arch AB₁ of the cycloid, since this solution is the best of all curves and also satisfies the stated inequality. But what will happen if the terminal point lies to the right of B, say at B₂?

Fig. 182


First of all, observe that the portion of the desired curve which lies strictly inside (not on the boundary) of the hatched area satisfies the Euler equation and therefore is a cycloid. This is true, for if the entire curve serves as the solution of the problem, then any arc of the curve is the curve of quickest descent between the endpoints.* Thus, the desired curve may consist of the integral curves of the Euler equation (which are called extremals) and the curves lying on the boundary of the hatched area. But now we see that the desired curve consists of the arc AB of the cycloid and the line segment BB₂. Indeed, take any other possible curve, say AB₁BB₂. Then on its segment AB₁B as well it must be a solution of the minimum problem, which is not so, for the arc of the cycloid AB yields a shorter time of descent. It is readily seen that we have to take the arc of precisely that cycloid for which B lies farthest to the right, i.e. the cycloid tangent to the lower straight line.

* An arc AC between A and any point C situated between A and B is the curve of quickest descent with zero initial velocity, which is to say it solves the same problem as was posed in determining AB. The arc between the two intermediate points C' and C" solves the problem of quickest descent for a given nonzero initial velocity at C' corresponding to the difference in the altitudes of A and C'.
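To get a feeling for the numbers, the following Python sketch compares the descent time along the composite curve (the cycloid arc AB followed by the horizontal segment BB₂) with the time along the straight line AB₂. It is an illustration under assumptions not stated in the text: a frictionless particle starting from rest at A, g = 9.81 m/s², the drop H = y_a − y_b measured in metres, and the cycloid tangent to the lower line, so that its generating circle has radius H/2. The function names are hypothetical, and the comparison does not by itself prove optimality.

```python
import math

G = 9.81  # acceleration of gravity, m/s^2 (assumed value)

def restricted_descent_time(drop, x_end):
    """Time along the cycloid arc AB plus the horizontal segment from B to B2."""
    r = drop / 2.0                     # cycloid touching the lower line at its lowest point
    x_b = math.pi * r                  # abscissa of the point of tangency B
    if x_end < x_b:
        raise ValueError("terminal point lies to the left of B; use a single cycloid arc")
    t_cycloid = math.pi * math.sqrt(r / G)      # dt = sqrt(r/g) d(theta), theta from 0 to pi
    t_floor = (x_end - x_b) / math.sqrt(2.0 * G * drop)   # constant speed sqrt(2gH)
    return t_cycloid + t_floor

def straight_line_time(drop, x_end):
    """Time along the straight incline from A to B2 (uniform acceleration g*H/d)."""
    d = math.hypot(x_end, drop)
    return d * math.sqrt(2.0 / (G * drop))
```

For a drop of 1 m and a horizontal distance of 3 m the composite curve takes about 1.03 s against about 1.43 s for the straight line.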

Thus,  the  solution  of  a variational  problem  with  restrictions 
may  only  partially  satisfy  the  condition  of  stationarity  (Euler's 
equation)  or  not  at  all.  As  in  the  case  of  a function  of  one  variable, 
this  solution  is  in  the  nature  of  a cuspidal  extremum. 

Extremum problems with restrictions also arise in control processes. For example, the problem may be one of controlling the motion of a vehicle in space by turning rudders so as to set the vehicle on a given flight path in the shortest possible time. Since what is sought is the law of rotation of the rudders in time, we have a variational problem. And if the rudders have turning limiters (designed, say, for restricting centrifugal forces), then we have a problem with restrictions. In an optimal control process obtained by solving a variational problem, there are time intervals during which the rudders lie on the limiters, that is, the solution passes along the boundary of the region in which it can reside. On approaching this boundary, the solution must satisfy definite conditions that we will not consider here.

Exercises 

1. Find the maximal and minimal values of the function $z = x^2 - y^2 - x + y$ in a triangle bounded by the straight lines x = 0, y = 6, x + y = 2.

2. Find the position of a heavy homogeneous string of length L > 2 suspended at the points (−1, 1) and (1, 1) if the string is located in the upper half-plane $y \ge 0$.

12.10  Variational  principles.  Fermat’s  principle  in  optics 

Besides separate problems of a variational nature, examples of which have been discussed in the foregoing sections, there are variational principles, each of which is applicable to analyzing a broad class of phenomena. Each of these principles ordinarily asserts that of all states, processes, and the like, that are admissible for the given constraints or restrictions in the physical system under consideration, only that state or process, etc., is materialized for which a certain functional (quite definite for the given principle) assumes a stationary value. Each of these principles makes possible a uniform consideration of many problems embraced by a single theory, and for this reason it often happens that such a principle is a substantial scientific attainment.

One of the most important and general principles of physics has already been demonstrated in a large number of cases. It consists in the following: of all possible positions of a physical system admissible under given constraints, the equilibrium position is that for which the potential energy of the system at hand has a stationary value (the value is minimal for a stable equilibrium of the system). If the system has a finite number of degrees of freedom, the potential energy can be expressed as a function of a finite number of generalized coordinates (see Sec. 4.8), so that the application of the indicated principle reduces to finding the minimum of a function of several variables (see Sec. 4.6). But if we consider a continuous medium whose state is described by a certain function of the coordinates (or by a system of such functions), then the potential energy is a functional, and application of this principle is carried out on the basis of the methods of the calculus of variations, just as we have already done.

Historically,  one  of  the  first  variational  principles  was  that  of 
Fermat  in  optics.  It  states  that  of  all  conceivable  paths  joining 
two  specified  points,  a light  ray  chooses  the  path  it  can  traverse  in 
minimum  time.  To  write  down  the  Fermat  principle  mathematically, 
recall that the velocity of light c in a vacuum has a very definite value, whereas in a transparent medium the velocity is $\frac{c}{n}$, where n > 1 is the index of refraction of the medium, and generally depends not only on the medium but also on the wavelength of the light. For the sake of simplicity, we consider a case where n does not depend on the wavelength and so will have a definite value in every medium. Since light covers a path dL in time $dt = dL : \frac{c}{n} = \frac{n}{c}\,dL$, it follows that the time t during which the given path (L) is traversed is found from

$$t = \frac{1}{c}\int_{(L)} n\,dL \tag{64}$$

Thus,  we  have  a functional  that  depends  on  the  choice  of  path 
connecting  two  points,  so  that  the  problem  of  finding  the  shape  of 
the  light  ray  is  a variational  problem. 

The  basic  laws  of  propagation  of  light  can  be  derived  from  the 
Fermat  principle. 

For example, in the case of the propagation of light in a medium with a constant optical density, i.e. with a constant velocity of light, the Fermat principle leads us to conclude that light is propagated in a straight line (Fig. 183, where S denotes the light source and A the point of observation), since the straight line SA is the shortest line connecting S and A.

From this, in turn, it follows that if the light path consists of several sections (for example, between successive reflections, refractions, and so forth), each of which is in a medium of constant optical density, then each of these sections is a straight-line segment and the entire path is a broken line. To determine the path length, it is required to find the coordinates of the vertices of the broken line, which is to say that there is only a finite number of degrees of freedom. For this reason such problems are solved by the tools of differential calculus. In particular, by solving the minimum problem, it is easy in this way to derive the laws of reflection of light from a plane surface and the refraction of light passing through a plane interface of two media (see HM, Sec. 4.1).

We  see  that  Fermat's  principle  does  indeed  lead  to  conclusions 
that  are  in  full  agreement  with  experiment  as  concerns  the  reflection 
and  refraction  of  light. 
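The refraction law can be recovered numerically from the minimum-time condition. In the Python sketch below, which is an illustration with assumed data, the interface is the line y = 0, the source lies at (0, 1) in a medium of index n1, the observation point at (1, −1) in a medium of index n2, and the crossing point (x, 0) is found by a ternary search on the optical path length n1·L1 + n2·L2. The function names are hypothetical.

```python
import math

def optical_path(x, n1, n2, source=(0.0, 1.0), target=(1.0, -1.0)):
    """c times the travel time for a ray crossing the interface y = 0 at (x, 0)."""
    return (n1 * math.hypot(x - source[0], source[1])
            + n2 * math.hypot(target[0] - x, target[1]))

def fastest_crossing(n1, n2, lo=0.0, hi=1.0, iterations=200):
    """Ternary search for the crossing point; the optical path is convex in x."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if optical_path(m1, n1, n2) < optical_path(m2, n1, n2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def snell_residual(x, n1, n2):
    """n1*sin(angle of incidence) - n2*sin(angle of refraction) for the default geometry."""
    sin1 = x / math.hypot(x, 1.0)
    sin2 = (1.0 - x) / math.hypot(1.0 - x, 1.0)
    return n1 * sin1 - n2 * sin2
```

For n1 = 1 and n2 = 1.5 the residual at the computed crossing point vanishes to within the accuracy of the search, which is the law of refraction n1 sin θ1 = n2 sin θ2.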

Now let us consider the reflection of light from a curved surface tangent to the plane at the point O (Fig. 184). The figure shows two examples of surfaces bent in opposite directions: IOI, concave downwards, IIOII, concave upwards from the x-axis. (We consider cylindrical surfaces with generatrices perpendicular to the plane of the drawing).

It can be shown that it suffices here to consider the rays that lie in the plane of the drawing, and sections of reflecting surfaces by the plane of the drawing. For this reason, in the future we will not speak about a reflecting surface but about a reflecting straight line, and not about reflecting curved surfaces, but about the reflecting curves IOI and IIOII in the plane of Fig. 184.

We can consider the problem without any concrete computations. It is known (see HM, Sec. 3.17) that the distance between a curve and a tangent line is proportional to $(x - x_0)^2$, where $x_0$ denotes the abscissa of the point of tangency O.

Let us consider the lengths of the broken lines $L_{st}$ $(SO_{st}A)$, $L_{I}$ $(SO_{I}A)$, and $L_{II}$ $(SO_{II}A)$. These broken lines are not shown in Fig. 184 so as not to clutter up the drawing; the points $O_{st}$, $O_{I}$, $O_{II}$, as can be seen in the drawing, lie to the right of the point of tangency O for one and the same value of x; here, $O_{st}$ lies on the straight line, $O_{I}$ on the lower line I, $O_{II}$ on the upper line II. If we see the points S, A, $O_{st}$, $O_{I}$, $O_{II}$ in the drawing, it is easy to imagine the broken lines.

The abscissas of $O_{st}$, $O_{I}$, $O_{II}$ are the same. The ordinates of $O_{st}$, $O_{I}$ and $O_{II}$ differ by a quantity proportional to $(x - x_0)^2$. Hence, the lengths of $L_{st}$, $L_{I}$, $L_{II}$ also differ by a quantity proportional to $(x - x_0)^2$. Let us write the expansions of $L_{st}$, $L_{I}$ and $L_{II}$ in a Taylor series in powers of $(x - x_0)$:

$$L_{st}(x) = L_{st}(x_0) + (x - x_0)\,L'_{st}(x_0) + \frac{(x - x_0)^2}{2}\,L''_{st}(x_0) + \dots,$$

$$L_{I}(x) = L_{I}(x_0) + (x - x_0)\,L'_{I}(x_0) + \frac{(x - x_0)^2}{2}\,L''_{I}(x_0) + \dots,$$

$$L_{II}(x) = L_{II}(x_0) + (x - x_0)\,L'_{II}(x_0) + \frac{(x - x_0)^2}{2}\,L''_{II}(x_0) + \dots$$

Since $L_{st}$, $L_{I}$ and $L_{II}$ differ only by a quantity of the order of $(x - x_0)^2$, this means that

$$L_{st}(x_0) = L_{I}(x_0) = L_{II}(x_0), \tag{65}$$

$$L'_{st}(x_0) = L'_{I}(x_0) = L'_{II}(x_0) \;(= 0), \tag{66}$$

$$L''_{st}(x_0) \ne L''_{I}(x_0) \ne L''_{II}(x_0) \tag{67}$$

The first of these equations, (65), are obvious consequences of the fact that all three curves (the straight line st, the curve I, and the curve II) pass through the single point O, $x = x_0$.

The second set of equations, (66), result from the fact that at the point O the three indicated curves are tangent to one another, and the angle of incidence of a ray is equal to the angle of reflection.

We know that if the derivative is zero, this means that $L_{st}$, $L_{I}$ and $L_{II}$, as functions of x, can have a minimum or a maximum at $x = x_0$. Whether this is a minimum or a maximum depends on the sign of the second derivative.

For the straight line, we have a minimum, $L''_{st} > 0$. However, from (67) we see that we cannot draw this conclusion for the curves I and II. In particular, for sufficient curvature of curve II, its length $L_{II}$ has precisely a maximum at O for $x = x_0$ and not a minimum. Because the curve II rises to the left and right of O, the path $SO_{II}A$ is shorter than SOA and the path SOA is the longest of all adjacent paths from point S to any point of the curve II and thence to point A.
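The sign of the second derivative can be checked on a concrete example. The Python sketch below is an illustration with assumed data, not the configuration of Fig. 184: the source is at S = (−1, 1), the observation point at A = (1, 1), the point of tangency O at the origin, and the reflecting curve is the parabola y = kx², so that k = 0 gives the straight line, k < 0 a curve of type I and k > 0 a curve of type II. The second derivative of the path length at x₀ = 0 is estimated by a central difference; the function names are hypothetical.

```python
import math

def path_length(x, k, source=(-1.0, 1.0), target=(1.0, 1.0)):
    """Length of the broken line from S to the point (x, k*x**2) of the mirror and on to A."""
    y = k * x * x
    return (math.hypot(x - source[0], y - source[1])
            + math.hypot(target[0] - x, target[1] - y))

def second_derivative_at_zero(k, step=1e-4):
    """Central-difference estimate of L''(x0) at the point of tangency x0 = 0."""
    return (path_length(step, k) - 2.0 * path_length(0.0, k)
            + path_length(-step, k)) / step ** 2
```

For this geometry a direct expansion gives L''(0) = (1 − 4k)/√2, so the straight line (k = 0) yields a minimum, a sufficiently curved line II (k > 1/4) yields a maximum, and k = 1/4 gives L'' = 0.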

Experiment shows that in the case of curve II as well the reflection occurs at the point O, that is, at the point where the length of the path has a maximum; it is obvious that at this point the angle of incidence is still equal to the angle of reflection.

The fact that the path length may not be a minimum but a maximum is very important for understanding the meaning and origin of the Fermat principle. It is obvious that this principle is not the consequence of any "striving" on the part of light to choose shortest routes: if there were such a "desire" on the part of light, it would not be indifferent to whether we were dealing with a minimum or a maximum.

Actually, Fermat's principle is a consequence of the fact that the propagation of light is a propagation of waves in which the electric and magnetic fields rapidly change sign ($10^{15}$ times per second for visible light). Accordingly, the wavelength of light is very small. At a given instant, the signs of the field are opposite at points a half-wavelength apart. Suppose at a certain time the field has a negative sign at the light source S. The sign of the field at A depends on how many times the sign can change over the route from S to A. If we consider two neighbouring routes from S to A, then for the same path length those fields that arrive at A from S along these routes will have the same sign; they will combine and reinforce each other.

If the path lengths are distinct, then the fields can either have the same sign or different signs and, on the average, they cancel out. Herein lies the reason why, in the propagation of light, an important role is played by beams of rays of the same wavelength and with the same time of propagation.

That the derivative of the path length with respect to the coordinate of the point of reflection is equal to zero means that at least to within terms of the order of $(x - x_0)^2$ the path lengths are the same, the waves that go through point O and the adjacent waves are reinforced. From this point of view it is obviously immaterial whether we deal with a minimum or a maximum, L'' > 0 or L'' < 0. What is more, it is clear that the best conditions for reflection are obtained when the path length is the same over as large a portion of the reflecting surface as possible. Which means that it is desirable that L'' = 0; then the dependence of L on x for small $x - x_0$ is still weaker, appearing only in the terms $(x - x_0)^3$ and higher. Physically speaking, L'' = 0 corresponds to a curvature of the reflecting line for which a mirror collects (focusses) at the point A the rays emanating from S.

In  geometrical  optics,  rays  are  regarded  as  geometric  lines. 
We  have  just  seen,  in  the  example  of  reflection,  that  the  laws  of 
geometrical  optics  are  a consequence  of  the  wave  nature  of  light. 
Instead  of  one  single  ray  we  considered  a beam  of  neighbouring  rays. 

We can give an estimate of the thickness of this beam. The amplitudes of rays whose path lengths from $S$ to $A$ differ by less than one half of the wavelength combine. We set up the condition $|L(x) - L(x_0)| = \frac{1}{2}\lambda$. Near the point of reflection, $L'(x_0) = 0$, so that

$$|L(x) - L(x_0)| = \frac{1}{2}(x - x_0)^2\,|L''(x_0)|, \quad\text{whence}\quad |x - x_0| = \sqrt{\frac{\lambda}{|L''(x_0)|}}$$

Thus, in reality the light is not reflected at a single point of the mirror, but on a spot, the dimensions of which, as may be seen from the formula, tend to zero only as the wavelength approaches zero.
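The estimate $|x - x_0| = \sqrt{\lambda/|L''(x_0)|}$ can be tried out numerically. The sketch below is only an illustration under assumed data: a flat mirror along $y = 0$, a source $S = (-d, h)$ and a receiver $A = (d, h)$ with the hypothetical values $d = h = 10$ cm, and $\lambda = 5\cdot 10^{-5}$ cm. The function names are ours, and $L''$ is estimated by a central difference.

```python
import math

def path_length(x, d, h):
    """Length of the broken path S -> (x, 0) -> A for S = (-d, h), A = (d, h)."""
    return math.hypot(x + d, h) + math.hypot(x - d, h)

def second_derivative(f, x, step=1e-3):
    """Central-difference estimate of f''(x)."""
    return (f(x + step) - 2.0 * f(x) + f(x - step)) / step**2

def spot_half_width(wavelength, l2):
    """|x - x0| from (1/2)(x - x0)^2 |L''(x0)| = (1/2) * wavelength."""
    return math.sqrt(wavelength / abs(l2))

d, h = 10.0, 10.0      # cm, hypothetical geometry (not taken from the text)
wavelength = 5e-5      # cm
x0 = 0.0               # by symmetry the stationary point of L(x)

l2 = second_derivative(lambda x: path_length(x, d, h), x0)
half_width = spot_half_width(wavelength, l2)
print("L''(x0) =", l2, " spot half-width (cm) =", half_width)
```

For this symmetric geometry $L''(0) = 2h^2/(d^2 + h^2)^{3/2}$, which gives a way to check the finite-difference value.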

Using Fermat's principle, one can consider the more complicated problem of the shape of a light ray in a medium with a gradually varying optical density, for example, the light ray coming from a star to an observer when account is taken of the gradual variation in the density of the atmosphere. In this case, the ray will proceed by a curvilinear path and will be determined from the condition of the stationary value (ordinarily a minimum) of the integral (64). In certain cases, integration of the appropriate Euler equation can be carried to quadrature. Consider for example the propagation of light in the $xy$-plane if the velocity of light depends only on the altitude $y$, that is, $n = n(y)$. Rewriting (64) as

$$ct = \int_{(L)} n(y)\sqrt{1 + y'^2}\,dx$$

we can take advantage of the intermediate integral (34) of Euler's equation, which yields

$$n(y)\sqrt{1 + y'^2} - n(y)\frac{y'}{\sqrt{1 + y'^2}}\,y' = C_1, \quad\text{or}\quad \frac{n(y)}{\sqrt{1 + y'^2}} = C_1$$

From this it is easy to find

$$y' = \frac{dy}{dx} = \frac{\sqrt{[n(y)]^2 - C_1^2}}{C_1}, \qquad \frac{C_1\,dy}{\sqrt{[n(y)]^2 - C_1^2}} = dx, \qquad C_1\int\frac{dy}{\sqrt{[n(y)]^2 - C_1^2}} = x + C_2$$

If it is possible to take the last integral for some special form of the function $n(y)$, we get the equation of the light ray in closed form, otherwise the integral can be computed in approximate fashion (Ch. 1). It is interesting to note that for $n = \frac{n_0}{\sqrt{y_0 - y}}$ we have a problem that was solved in Sec. 12.4 (light is propagated along cycloids) and for $n = n_0 y$ we have a problem solved in Sec. 12.5 (light is propagated along catenaries).
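When the quadrature $C_1\int dy/\sqrt{[n(y)]^2 - C_1^2} = x + C_2$ has no convenient closed form, it can be evaluated numerically. As a hedged check of the procedure, the sketch below applies Simpson's rule to the case $n = n_0 y$, where the integral is known exactly, $\frac{C_1}{n_0}\operatorname{arccosh}\frac{n_0 y}{C_1}$, which is the catenary. The values $n_0 = C_1 = 1$, the integration limits and the function names are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def ray_dx(n_of_y, c1, y_start, y_end):
    """x(y_end) - x(y_start) = C1 * integral of dy / sqrt(n(y)^2 - C1^2)."""
    return c1 * simpson(lambda y: 1.0 / math.sqrt(n_of_y(y) ** 2 - c1 ** 2),
                        y_start, y_end)

n0, c1 = 1.0, 1.0                 # illustrative constants
y_start, y_end = 1.5, 3.0         # chosen so that n(y) > C1 on the whole interval

dx_numeric = ray_dx(lambda y: n0 * y, c1, y_start, y_end)
dx_closed = (c1 / n0) * (math.acosh(n0 * y_end / c1) - math.acosh(n0 * y_start / c1))
print(dx_numeric, dx_closed)
```

The integrand is singular where $n(y) = C_1$ (the turning point of the ray), so a quadrature that starts exactly there would need a substitution or a special rule; the limits above avoid that point.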
Note that in the general case where the refractive index depends on the light frequency, it is necessary to distinguish between the group velocity and the phase velocity of light in the medium.

The phase velocity of light, $c_{ph}$, gives the following relationship between the wavelength and the oscillation period:

$$\lambda = c_{ph}\tau$$

Here, all fields are harmonically dependent on $x$ and $t$:

$$H,\ E \propto \cos\left(\frac{2\pi t}{\tau} - \frac{2\pi x}{\lambda}\right) = \cos\left[\frac{2\pi}{\lambda}(c_{ph}t - x)\right]$$

($\propto$ is the proportionality symbol). The quantity under the cosine symbol may be called the phase, whence the term "phase velocity". The refractive index gives a direct description of the phase velocity, $c_{ph} = \frac{c}{n}$, where $c$ is the velocity of light in vacuum.

We must distinguish between $c_{ph}$ and the so-called group velocity of light, $c_{gr}$. The group velocity is involved in the answer to the question: What is the time lapse for a light ray travelling from point 1, after a shutter is opened, to point 2 in a homogeneous medium? The answer is $t = r_{12}/c_{gr}$, where $c_{gr}$ is expressed in terms of the refractive index thus:

$$c_{gr} = \frac{c}{n + \omega\dfrac{dn}{d\omega}}$$

Hence $c_{gr}$ is equal to $c_{ph}$ only in the particular case where $n$ does not depend on $\omega$; in the general case they differ.

For example, the refractive index is less than 1 for X-rays in the medium, $n = 1 - \frac{a}{\omega^2}$. Hence $c_{ph} > c$. But this does not mean that a signal can be propagated with a velocity greater than that of light! For it is easy to see that

$$n + \omega\frac{dn}{d\omega} = 1 - \frac{a}{\omega^2} + \omega\cdot\frac{2a}{\omega^3} = 1 + \frac{a}{\omega^2}$$

so that $c_{gr} < c$.
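The two velocities are easy to compare numerically for the dispersion law $n = 1 - a/\omega^2$. The following sketch uses units in which $c = 1$ and the illustrative values $a = 0.1$, $\omega = 1$ (assumptions, not data from the text); the derivative $dn/d\omega$ is taken by a central difference.

```python
def refractive_index(omega, a):
    """Model dispersion law n = 1 - a / omega^2."""
    return 1.0 - a / omega ** 2

def group_index(n, omega, step=1e-5):
    """n + omega * dn/d(omega), with the derivative from a central difference."""
    dn = (n(omega + step) - n(omega - step)) / (2.0 * step)
    return n(omega) + omega * dn

c = 1.0                    # velocity of light in vacuum, in units where c = 1
a, omega = 0.1, 1.0        # illustrative values

n = lambda w: refractive_index(w, a)
n_group = group_index(n, omega)
c_ph = c / n(omega)        # phase velocity, exceeds c here
c_gr = c / n_group         # group velocity, stays below c
print(c_ph, c_gr, n_group)
```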

Experiment  shows  that  Fermat's  principle  involves  just  n and 
cph;  for  this  reason  it  is  once  again  clear  that  the  Fermat  principle 
is  not  the  result  of  a tendency  on  the  part  of  light  to  get  from  one 
point  to  another  in  the  shortest  possible  time.  The  matter  lies  in  the 
condition  under  which  the  waves  combine,  and  this  depends  on  the 
phase  of  the  wave. 

The difference between minimality and stationarity of the value of a functional in a real process is illustrated very well in the following simple example taken from a paper entitled "The Principle of Least Action" [13] by the celebrated German theoretical physicist Max Planck (1858-1947). The path of a free particle moving on a surface without application of external forces is the shortest route connecting the initial and terminal points of the path. On a sphere, this is an arc of a great circle. But if the path is longer than half the circumference of a great circle, then, as can readily be seen, the length, though stationary, will not be minimal compared with the lengths of neighbouring routes. Neither will the value be a maximum, for by introducing zigzags it is possible to leave the initial and terminal points unchanged and yet increase the path length. This value is in the nature of a minimax (see Sec. 4.6). Recalling the once current idea that nature is governed by God's handiwork and that underlying every phenomenon of nature is a conscious intention directed at a specific purpose and that this purpose is attained by the shortest route and via the best means, Planck ironically observed: "Hence Providence is no longer operative beyond the limits of a semicircle."

Fig. 185

Exercises 

1. The equation of an ellipse is known to be derived from the condition that the sum of the distances of any point $K$ from two points $F_1$ ($x_1 = -c$, $y = 0$) and $F_2$ ($x_2 = c$, $y = 0$), that is, the sum $F_1K + F_2K = L$, is the same for all points (Fig. 185). Find the tangent and the normal to an ellipse at an arbitrary point $K$. Find the angles between the lines $F_1K$, $F_2K$ and the normal at the point $K$. Show that all rays emanating from $F_1$ pass through $F_2$ after reflection and, hence, come to a focus at $F_2$.

2. Consider the reflection from the parabola $y = x^2$ of a parallel beam of light going downward along the axis of ordinates (Fig. 186). Assuming that the light moves from $A$, $x_A = 0$, $y_A = Y$, and regarding $Y$ as large, replace by $Y - y_K$ the distance $AK$ equal to $\sqrt{x_K^2 + (Y - y_K)^2}$. Find the length $L = AKN$ and find the point $N$ for which $\frac{dL}{dx_K} = 0$. Find the tangent and the normal to the parabola at the point $K$ and convince yourself that for $N$ the condition of the angle of incidence being equal to the angle of reflection holds true.

Fig. 186

3. The points $S$ and $A$ are located 1 cm above a mirror and are separated by 2 cm. Find $L''(x_0)$ and, for $\lambda = 5\cdot 10^{-5}$ cm, find the dimensions of the reflecting region.

12.11  Principle  of  least  action 

The  success  of  the  universal  principle  of  the  minimum  of  potential 
energy  used  to  determine  the  position  of  equilibrium  of  a system 
stimulated  searches  for  an  analogous  universal  principle  with  the 
aid  of  which  it  might  be  possible  to  determine  possible  motions  of 
a system.  This  led  to  the  discovery  of  the  “principle  of  least 
action”. 

Let us first consider a special case. Suppose a particle of mass $m$ is in motion along the $x$-axis under the action of a force with potential $U(x)$. As we know (see HM, Sec. 6.8), the equation of motion is then of the form

$$m\frac{d^2x}{dt^2} = -\frac{dU}{dx} \qquad (68)$$

It is easy to choose a functional for which the last equation is just the Euler equation (Sec. 12.4). Denoting $\frac{dx}{dt} = \dot x$, we rewrite (68) in the form

$$\frac{dU}{dx} + \frac{d}{dt}(m\dot x) = 0, \quad\text{or}\quad \frac{d(-U)}{dx} - \frac{d}{dt}\left[\frac{d}{d\dot x}\left(\frac{m\dot x^2}{2}\right)\right] = 0 \qquad (69)$$

In form, the last equation resembles the Euler equation, which in the case of the desired function $x(t)$ should have the form (cf. (29))

$$\frac{\partial}{\partial x}F(t, x, \dot x) - \frac{d}{dt}\left[\frac{\partial}{\partial\dot x}F(t, x, \dot x)\right] = 0 \qquad (70)$$

However, in (70) the same function is differentiated in both members, whereas this is not so in equation (69). So let us have (69) take on the aspect of (70) by adding, under the derivative signs, terms whose derivatives are equal to zero:

$$\frac{\partial}{\partial x}\left[\frac{m\dot x^2}{2} - U(x)\right] - \frac{d}{dt}\left\{\frac{\partial}{\partial\dot x}\left[\frac{m\dot x^2}{2} - U(x)\right]\right\} = 0$$

We see (cf. formulas (25) and (29)) that the desired functional is

$$S = \int_{t_1}^{t_2}\left[\frac{m\dot x^2}{2} - U(x)\right]dt$$

Observe that the term $\frac{1}{2}m\dot x^2$ is just equal to the kinetic energy $T$ of the moving particle; denote the integrand as

$$L = T - U \qquad (71)$$

This is the so-called Lagrangian function. Then the variational problem consists in seeking the stationary value of the integral

$$S = \int_{t_1}^{t_2} L\,dt \qquad (72)$$

which is called the action. Here, $t_1$ and $t_2$ are the initial and terminal time of the motion and we compare all conceivable forms of motions with the same initial conditions and the same terminal conditions. It can be verified that in a large number of cases, for instance if the time interval $t_2 - t_1$ is sufficiently small, the integral (72) has a minimal value and not merely a stationary value for actual motion. For this reason, the possibility of finding that motion by proceeding from the variational problem for the integral (72) is called the principle of least action.
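The minimal property of the action can be observed numerically in a simple case. The sketch below is an illustration under assumptions of our own choosing: a particle of mass $m = 1$ in a uniform field with $U = mgx$, moving from $x = 0$ at $t = 0$ back to $x = 0$ at $t = T$. The actual motion is $x = \frac{1}{2}gt(T - t)$; the comparison paths add $\varepsilon\sin(\pi t/T)$, which keeps both end points fixed. The action is approximated by a midpoint sum.

```python
import math

def action(path, t1, t2, m, potential, steps=2000):
    """Approximate S = integral of (m v^2 / 2 - U(x)) dt by a midpoint sum."""
    dt = (t2 - t1) / steps
    s = 0.0
    for k in range(steps):
        ta, tb = t1 + k * dt, t1 + (k + 1) * dt
        v = (path(tb) - path(ta)) / dt           # mean velocity on the step
        x_mid = 0.5 * (path(ta) + path(tb))      # mean position on the step
        s += (0.5 * m * v * v - potential(x_mid)) * dt
    return s

m, g, T = 1.0, 9.8, 1.0                          # illustrative values
U = lambda x: m * g * x

def true_path(t):
    """Solution of m x'' = -dU/dx with x(0) = x(T) = 0."""
    return 0.5 * g * t * (T - t)

def perturbed(eps):
    """Comparison path with the same end points."""
    return lambda t: true_path(t) + eps * math.sin(math.pi * t / T)

s_true = action(true_path, 0.0, T, m, U)
s_values = [action(perturbed(eps), 0.0, T, m, U) for eps in (-0.2, -0.05, 0.05, 0.2)]
print(s_true, s_values)
```

Because $U$ is linear here, the excess of the action over its value on the true path is exactly $\frac{m}{2}\varepsilon^2\int\dot\eta^2\,dt > 0$ for any admissible perturbation $\eta$, so in this example the stationary value is a genuine minimum for every time interval.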
It turns out that the variational principle of least action is of a universal nature and holds true for any closed systems not involving dissipation of energy, for example, via friction; incidentally, a system with dissipation may in a certain sense be considered open. According to this principle, of all conceivable (under the given constraints) modes of passing from one state at time $t_1$ to another state at time $t_2$, the system chooses that mode for which the action, that is, the integral (72), assumes a stationary (minimal, as a rule) value. Here the Lagrangian function $L$ is the difference (71) between the kinetic energy and the potential energy of the system, each of these energies being expressed in terms of the generalized coordinates of the system (see Sec. 4.8) and the time derivatives. The principle of least action is applicable both to systems with a finite number of degrees of freedom and to continuous media, and not only to mechanical but also electromagnetic and other phenomena.

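Let us apply the principle of least action to deriving the equation of small transverse oscillations of a taut membrane that was considered at the end of Sec. 12.6. Since the kinetic energy of an element $d\sigma$ of the membrane is equal to

$$\frac{1}{2}\rho\,d\sigma\left(\frac{\partial z}{\partial t}\right)^2$$

where $\rho$ is the surface density of the membrane, the total kinetic energy of the membrane is

$$\frac{1}{2}\iint_{(\sigma)}\rho\left(\frac{\partial z}{\partial t}\right)^2 dx\,dy$$

Using the expression (43) for the potential energy, we get the Lagrangian function and the action (here $T$ is the tension of the membrane, not the kinetic energy):

$$L = \frac{1}{2}\iint_{(\sigma)}\left\{\rho\left(\frac{\partial z}{\partial t}\right)^2 - T\left[\left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2\right]\right\}dx\,dy, \qquad S = \int_{t_1}^{t_2} L\,dt$$

Applying the Euler equation with respect to the three independent variables (cf. Sec. 12.6), we get the equation of oscillations of a membrane:

$$-\frac{\partial}{\partial t}\left(\rho\frac{\partial z}{\partial t}\right) + \frac{\partial}{\partial x}\left(T\frac{\partial z}{\partial x}\right) + \frac{\partial}{\partial y}\left(T\frac{\partial z}{\partial y}\right) = 0$$

or

$$\frac{\partial^2 z}{\partial t^2} = \frac{T}{\rho}\left(\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2}\right)$$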
At first glance, expression (71) appears to be formal and unnatural for the Lagrangian function because it is not clear what meaning the difference of energies can be endowed with. The situation is clarified by a simple example. The trajectory of a particle with given energy $E$ moving in a given stationary potential field $U$ is determined by the condition that $\int\mathbf p\cdot d\mathbf r$ along the path be a minimum, where $\mathbf p = m\mathbf v$ is the momentum and $\mathbf r$ is the radius vector. And so we have

$$T - U = 2T - (T + U) = 2T - E$$

Hence

$$\int_{t_1}^{t_2} L\,dt = \int_{t_1}^{t_2} 2T\,dt - \int_{t_1}^{t_2} E\,dt = \int_{t_1}^{t_2} 2T\,dt - E(t_2 - t_1)$$

where the term $E(t_2 - t_1)$ is constant. Writing $2T = 2\frac{mv^2}{2} = m\mathbf v\cdot\mathbf v$, $\mathbf v\,dt = d\mathbf r$, we then find that the extremum condition for $\int_{t_1}^{t_2} L\,dt$ coincides with the extremum condition for $\int\mathbf p\cdot d\mathbf r$ for given initial and terminal points of the path and a given energy. Here, $p^2 = 2m(E - U)$.

At the same time the stationarity condition for $\int\mathbf p\cdot d\mathbf r$ is fully analogous to the Fermat principle. The point is that in quantum mechanics the wavelength associated with the motion of a body is equal to $\lambda = 2\pi\hbar/p$, where $\hbar$ is Planck's constant, $\hbar \approx 10^{-27}$ g·cm²/s. Here the probability of finding a particle at a given point is proportional to the square of the amplitude of the wave, and waves that traverse different paths combine. It is precisely the condition of stationarity of $\int\mathbf p\cdot d\mathbf r$ along the trajectory computed in classical mechanics that is the condition that waves with the same phase traversing paths close to the trajectory are additive.

The analogy between the mechanical principle of least action and Fermat's principle played a big role in the development of quantum mechanics.

There are many other variational principles in diverse fields, including those far removed from physics, such as economics. In economic problems we usually have certain resources (money, materials, etc.) at our disposal that are to be utilized in order to obtain maximum usefulness. What we have is a maximization problem in which the sought-for element is an optimal plan for utilizing resources and maximizing usefulness. Depending on the nature of the problem, the choice of a plan may involve either a finite number of degrees of freedom, which means we seek a set of parameters defining the plan, or an infinite number of degrees of freedom, so that what we seek is a function. In the former case, the problem is sometimes solved with the tools of differential calculus (Secs. 4.6, 4.7) and sometimes, if the number of desired quantities and the restrictions imposed on them are great, with the tools of a new mathematical discipline called the "theory of mathematical programming". In the latter instance, the problem belongs to the calculus of variations. Similar extremal principles are constantly being used in our activities, although we are not always aware of this fact nor do we always apply them in the most apt fashion.

Exercise 

Proceeding from the expression (11) for the potential energy of a homogeneous string on a cushion, derive the equation of motion of the string disregarding the kinetic energy of the cushion.

12.12  Direct  methods 

For a long time almost the only way to solve variational problems was to pass to the differential equations of Euler. However, the solution of the resulting boundary-value problems for the differential equations very often turned out to be a complicated matter. To obtain an explicit solution in quadratures is a rare event and one has to make use of approximate methods. Rarer still is an explicit solution of the Euler equations for variational problems involving several independent variables.

Of late, frequent use is made of a number of effective approximate methods for the direct solution of variational problems; these methods do not involve passing to differential equations and go by the name of direct methods of the calculus of variations. Most of them are based on passing to extremum problems of a function of several variables, which is to say, to problems with a finite number of degrees of freedom. We will now consider two such methods.

The first is based on a process which is the reverse of that described in Sec. 12.1, where we passed from a problem with a finite number of degrees of freedom to the variational problem by means of refining the partition of the interval, as a result of which the sum (3) passed into the integral (7). Using the reverse process, we can pass from the integral to a sum and thus reduce the problem of extremizing a functional to the problem of extremizing a function of several variables.

For example, suppose we are considering the extremum of the functional (25) under the boundary conditions (26). Partitioning the interval of integration into $n$ equal subintervals by means of the division points $x_0 = a, x_1, x_2, \ldots, x_n = b$ and setting $y(x_i) = y_i$, we approximately replace $y'_i \approx \frac{y_{i+1} - y_i}{h}$, where $h = \frac{b - a}{n}$:

$$\int_a^b F(x, y, y')\,dx \approx \sum_{i=0}^{n-1} F(x_i, y_i, y'_i)\,h \qquad (73)$$

(As in Secs. 1.1 and 2.2, we could make use of the more exact formulas of numerical differentiation and integration, but we will not dwell on that here.) Since the values $y_0 = y_a$ and $y_n = y_b$ are given, the problem reduces to finding the extremum of the function of $n - 1$ variables $y_1, y_2, \ldots, y_{n-1}$ in the right member of (73). As was demonstrated in Sec. 12.1, this problem can sometimes be carried to completion with relative ease.
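The passage from the integral to a sum can be sketched for a concrete case. As an illustrative choice of our own, take $\int_0^1 (y'^2 + y^2)\,dx$ with $y(0) = 0$, $y(1) = 1$, whose Euler equation $y'' = y$ has the solution $\sinh x/\sinh 1$. Equating to zero the partial derivatives of the sum with respect to $y_1, \ldots, y_{n-1}$ gives the linear system $-y_{k-1} + (2 + h^2)y_k - y_{k+1} = 0$, which the code solves by elimination. The function name and the value $n = 50$ are assumptions.

```python
import math

def discretized_minimizer(n, ya=0.0, yb=1.0):
    """Minimize sum_i h*[((y[i+1]-y[i])/h)**2 + y[i]**2] on [0, 1], ends fixed.

    The stationarity conditions form the tridiagonal system
        -y[k-1] + (2 + h^2) y[k] - y[k+1] = 0,   k = 1..n-1,
    solved by forward elimination and back substitution.
    """
    h = 1.0 / n
    m = n - 1                        # number of unknown interior values
    diag = [2.0 + h * h] * m
    rhs = [0.0] * m
    rhs[0] += ya                     # known boundary values move to the right side
    rhs[-1] += yb
    for k in range(1, m):            # eliminate the sub-diagonal (entries are -1)
        diag[k] -= 1.0 / diag[k - 1]
        rhs[k] += rhs[k - 1] / diag[k - 1]
    y = [0.0] * m
    y[-1] = rhs[-1] / diag[-1]
    for k in range(m - 2, -1, -1):   # back substitution
        y[k] = (rhs[k] + y[k + 1]) / diag[k]
    return [ya] + y + [yb]

n = 50
y = discretized_minimizer(n)
y_mid = y[n // 2]                                  # value at x = 0.5
exact_mid = math.sinh(0.5) / math.sinh(1.0)
print(y_mid, exact_mid)
```

The simplest difference quotient is used here, as in (73); more accurate quadrature and differentiation formulas would change the coefficients of the system but not the overall scheme.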

One of the most common direct methods of the calculus of variations is the so-called Ritz method*, which consists in the fact that the desired function is sought in a form that involves several arbitrary constants (parameters). For example, in the case of one independent variable in the form

$$y = \varphi(x; C_1, C_2, \ldots, C_n) \qquad (74)$$

Here the right member is chosen so that the given boundary conditions are satisfied for any values of the parameters. Substituting (74) into the expression for the given functional, we find that the value of the functional turns out to be dependent on $C_1, C_2, \ldots, C_n$. Thus, the problem of extremizing the functional reduces to the problem of extremizing a function of $n$ independent parameters $C_1, C_2, \ldots, C_n$, and this problem can be solved by the methods of Ch. 4. True, the solution thus obtained is only an approximate solution of the original problem, since by far not every function can be represented in the form (74). The larger the number of parameters $C_i$ that have been introduced, the more "flexible" is the formula (74), that is, the more exactly we can represent the desired solution by this formula, but the more complicated are the computations. For this reason, in practice, the number of such parameters is restricted to a few and at times it is sufficient to take $n = 1$.

* This method was proposed in 1908 by the Swiss physicist and mathematician W. Ritz. In 1915 the Russian mechanician B. G. Galerkin (1871-1945) used a more general method that is applicable to boundary-value problems that are not necessarily of a variational origin. For this reason, the method is sometimes called the method of Ritz-Galerkin.
Some examples. Suppose we seek to minimize the functional

$$I = \int_0^1 (y'^2 + y^2)\,dx \qquad (75)$$

given the boundary conditions

$$y(0) = 0, \quad y(1) = 1 \qquad (76)$$

We seek an approximate solution in the form

$$y = x + Cx(1 - x) \qquad (77)$$

(The first of these terms satisfies the conditions (76), the second, the corresponding homogeneous conditions that continue to hold when multiplied by an arbitrary constant, so that the entire sum also satisfies the conditions (76); the same pattern is used for setting up the function (74) in other examples as well.) Substituting (77) into (75) yields, after calculations,

$$I = \int_0^1 [(1 + C - 2Cx)^2 + (x + Cx - Cx^2)^2]\,dx$$

$$= \left(1 + C^2 + 4C^2\cdot\frac{1}{3} + 2C - 4C\cdot\frac{1}{2} - 4C^2\cdot\frac{1}{2}\right) + \left(\frac{1}{3} + C^2\cdot\frac{1}{3} + C^2\cdot\frac{1}{5} + 2C\cdot\frac{1}{3} - 2C\cdot\frac{1}{4} - 2C^2\cdot\frac{1}{4}\right)$$

$$= \frac{4}{3} + \frac{1}{6}C + \frac{11}{30}C^2$$

To find the minimum of this function of $C$, equate the derivative to zero:

$$\frac{1}{6} + \frac{11}{15}C = 0$$

whence $C = -\frac{5}{22} = -0.227$, which means the approximate solution (77) is of the form

$$y_{\mathrm{I}} = x - 0.227x(1 - x) = 0.773x + 0.227x^2$$

A more exact result is obtained if we seek the approximate solution in the form $y = x + C_1x(1 - x) + C_2x^2(1 - x)$, which includes two parameters. Substituting into (75) and equating to zero the derivatives with respect to $C_1$ and $C_2$ leads (we leave it up to the reader to carry out the computations) to $C_1 = -0.146$, $C_2 = -0.163$, that is, to the approximate solution

$$y_{\mathrm{II}} = x - 0.146x(1 - x) - 0.163x^2(1 - x) = 0.854x - 0.017x^2 + 0.163x^3$$
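The Ritz computations for (75) can be carried out in exact rational arithmetic. In the sketch below polynomials are stored as coefficient lists and the functional is handled through the bilinear form $a(u, v) = \int_0^1 (u'v' + uv)\,dx$; minimizing $a(y_0 + \sum C_j\varphi_j,\ y_0 + \sum C_j\varphi_j)$ leads to the linear equations $\sum_j a(\varphi_i, \varphi_j)C_j = -a(\varphi_i, y_0)$. The helper names are ours; the trial functions are those of the text.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def poly_diff(p):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def integral01(p):
    """Integral of a polynomial over [0, 1]."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

def form(u, v):
    """a(u, v) = integral over [0, 1] of (u'v' + uv)."""
    return (integral01(poly_mul(poly_diff(u), poly_diff(v)))
            + integral01(poly_mul(u, v)))

y0 = [0, 1]              # x, satisfies the boundary conditions (76)
phi1 = [0, 1, -1]        # x(1 - x)
phi2 = [0, 0, 1, -1]     # x^2 (1 - x)

# One parameter, trial function (77).
c_single = -form(phi1, y0) / form(phi1, phi1)

# Two parameters: 2x2 linear system solved by Cramer's rule.
a11, a12, a22 = form(phi1, phi1), form(phi1, phi2), form(phi2, phi2)
b1, b2 = -form(phi1, y0), -form(phi2, y0)
det = a11 * a22 - a12 * a12
c1 = (b1 * a22 - a12 * b2) / det
c2 = (a11 * b2 - a12 * b1) / det
print(c_single, c1, c2, float(c1), float(c2))
```

Rational arithmetic is chosen only to make the comparison with the rounded coefficients in the text unambiguous; floating-point numbers would serve equally well in practice.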
It is easy to obtain an exact solution here. Indeed, the Euler equation for the functional (75) is of the form

$$2y - \frac{d}{dx}(2y') = 0, \quad\text{or}\quad y'' - y = 0$$

and has the general solution (Sec. 7.3)

$$y = Ae^x + Be^{-x}$$

where $A$ and $B$ are arbitrary constants. Under the conditions (76) we get

$$y = \frac{e^x - e^{-x}}{e - e^{-1}}$$
This exact solution can be compared with both approximate solutions (see table).

x      y_I     y_II    y_exact
0.0    0.000   0.000   0.000
0.1    0.080   0.085   0.085
0.2    0.164   0.171   0.171
0.3    0.252   0.259   0.259
0.4    0.346   0.349   0.350
0.5    0.443   0.443   0.444
0.6    0.546   0.541   0.542
0.7    0.652   0.645   0.645
0.8    0.764   0.756   0.754
0.9    0.880   0.874   0.873
1.0    1.000   1.000   1.000

Thus, even in the simplest variant the Ritz method yields a very high accuracy in this case.

The Ritz method produces still higher accuracy if what is required is not the extremizing function but the extremal value of the functional itself. This is natural, for a small variation of the function near the stationary value of the functional leads to a still smaller variation of the value of the functional. (Compare the change in the independent and dependent variables near the stationary value of the function, for instance, the change in $x$ and $y$ near the value $x = c$ in Fig. 180.) For this reason, the error in the value of the functional will be of higher order compared with the error in the extremizing function. Thus, in the foregoing example, when we substitute the function $y_{\mathrm{I}}(x)$, the functional (75) produces an approximate minimal value of 1.314, whereas the exact value is 1.313. The error comes to 0.1%. And to detect the error when substituting $y_{\mathrm{II}}(x)$ would require computations of much higher accuracy.
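The figures quoted for the value of the functional are easy to reproduce. For the one-parameter trial function $y = x + Cx(1 - x)$ the functional (75) equals $\frac{4}{3} + \frac{1}{6}C + \frac{11}{30}C^2$ with $C = -\frac{5}{22}$ at the minimum, while on the exact solution $\sinh x/\sinh 1$ integration by parts gives $I = y(1)y'(1) = \coth 1$. The short sketch below, with variable names of our own, compares the two.

```python
import math

def ritz_value(c):
    """I(C) = 4/3 + C/6 + 11 C^2 / 30 for the trial function y = x + C x (1 - x)."""
    return 4.0 / 3.0 + c / 6.0 + 11.0 * c * c / 30.0

c_opt = -5.0 / 22.0
i_ritz = ritz_value(c_opt)
i_exact = math.cosh(1.0) / math.sinh(1.0)   # value of (75) on y = sinh x / sinh 1
rel_error = (i_ritz - i_exact) / i_exact
print(i_ritz, i_exact, rel_error)
```

The relative error of the functional is about one part in a thousand, noticeably smaller than the relative deviations of $y_{\mathrm{I}}$ from the exact solution at individual points of the table.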

If we perform similar computations for the functional

$$I = \int_0^1 (y'^2 + xy^2)\,dx \qquad (78)$$

(this the reader can do for himself), then for just about the same amount of computation we obtain an approximate solution with high accuracy, although in this case the exact solution cannot be expressed in terms of elementary functions. This is no obstacle to direct methods.

Here is another example of an extremal problem for a function of two variables. Let it be required to find the extremum of the functional

$$I = \int_{-1}^{1}\int_{-1}^{1}\left[\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 + 2u\right]dx\,dy \qquad (79)$$

from among functions that vanish on the boundary of a square bounded by the straight lines $x = \pm 1$, $y = \pm 1$. Here, the Euler equation cannot be solved exactly by means of elementary functions. We seek an approximate solution by the Ritz method in the form

$$u = C(1 - x^2)(1 - y^2)$$

Substitution into (79) gives

$$I = \int_{-1}^{1}\int_{-1}^{1}\{[-2Cx(1 - y^2)]^2 + [-2Cy(1 - x^2)]^2 + 2C(1 - x^2)(1 - y^2)\}\,dx\,dy$$

$$= 2\cdot 4C^2\cdot\frac{2}{3}\cdot 2\left(1 - \frac{2}{3} + \frac{1}{5}\right) + 2C\cdot 2\left(1 - \frac{1}{3}\right)\cdot 2\left(1 - \frac{1}{3}\right) = \frac{256}{45}C^2 + \frac{32}{9}C$$

Equating the derivative to zero produces a minimum at $C = -\frac{5}{16}$. Hence the solution of the minimum problem of the functional (79) for given boundary conditions is approximately of the form

$$u = -\frac{5}{16}(1 - x^2)(1 - y^2)$$

A comparison with the exact formula, which has the form of an infinite series, shows that the error of this approximate solution is equal, on the average, to 1.5%; the error in the value of the functional $I$ is about 0.2%.
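The one-parameter Ritz computation for (79) reduces to products of one-dimensional integrals over $[-1, 1]$, so it can be checked in exact arithmetic. The sketch below, with helper names of our own, computes the coefficients of $I(C) = qC^2 + rC$ for the trial function $u = C(1 - x^2)(1 - y^2)$ and the minimizing value $C = -r/(2q)$.

```python
from fractions import Fraction

def integral_sym(p):
    """Integral over [-1, 1] of a polynomial given by its coefficient list.

    Odd powers integrate to zero; x^k with even k contributes 2 / (k + 1).
    """
    return sum(Fraction(2 * c, k + 1) for k, c in enumerate(p) if k % 2 == 0)

x_squared = [0, 0, 1]            # x^2
one_minus_sq = [1, 0, -1]        # 1 - x^2
one_minus_sq2 = [1, 0, -2, 0, 1] # (1 - x^2)^2

# (du/dx)^2 integrates to 4 C^2 * int x^2 dx * int (1 - y^2)^2 dy;
# (du/dy)^2 gives the same value by symmetry.
quad = 2 * 4 * integral_sym(x_squared) * integral_sym(one_minus_sq2)
# 2u integrates to 2 C * int (1 - x^2) dx * int (1 - y^2) dy.
lin = 2 * integral_sym(one_minus_sq) ** 2

c_opt = -lin / (2 * quad)                  # root of dI/dC = 2 quad C + lin
i_min = quad * c_opt ** 2 + lin * c_opt    # approximate minimal value of (79)
print(quad, lin, c_opt, i_min)
```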

Exercises 

1. Find an approximate solution to the problem of the minimum of the functional (75) under the conditions (76), choosing the approximate solution in the form $y = x + C\sin\pi x$. Compare the resulting values of $I$ and $y(0.5)$ with the exact values.

2. Find an approximate solution to the problem of the minimum of the functional (78) under the conditions (76), choosing the approximate solution in the form (77).

3. Find an approximate solution of the problem discussed in the text on the minimum of the functional (79), taking the approximate solution in the form $u = C\cos\frac{\pi x}{2}\cos\frac{\pi y}{2}$. Compare the resulting values of $I$ and $u(0, 0)$ with those found in the text.


ANSWERS  AND  SOLUTIONS 

Sec.  12.1 

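Reasoning as in the text, we find

$$y_i = -\frac{h}{P}[(i - 1)F_1 + (i - 2)F_2 + \ldots + F_{i-1}] + iy_1$$

$$0 = -\frac{h}{P}[(n - 1)F_1 + (n - 2)F_2 + \ldots + F_{n-1}] + ny_1$$

whence

$$y_i = \frac{h(n - i)}{Pn}[F_1 + 2F_2 + \ldots + (i - 1)F_{i-1}] + \frac{hi}{Pn}[(n - i)F_i + (n - i - 1)F_{i+1} + \ldots + 1\cdot F_{n-1}]$$

Putting $n = \frac{l}{h}$, $i = \frac{x}{h}$, $F_i = f(x_i)h$, $h \to 0$, we get, in the limit,

$$y = \frac{l - x}{Pl}\int_0^x \xi f(\xi)\,d\xi + \frac{x}{Pl}\int_x^l (l - \xi)f(\xi)\,d\xi \quad\text{(cf. Sec. 6.2)}$$

Sec. 12.2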


I i 

1.  87  = - ^^-dx,  2 y(0)  8y(0)  + J (x&y  + 2 y'ty')  dx. 

0 0 

1 1 

2. In this example, $\Delta I = \int_0^1 (2x + \alpha x^2)^2\,dx - \int_0^1 (2x)^2\,dx = \alpha + \frac{\alpha^2}{5}$, $\delta I = \alpha$.

Computations:

α        ΔI         δI       relative error
1        1.2        1        -17%
-0.1     -0.098     -0.1     -2%
0.01     0.01002    0.01     -0.2%
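The table can be regenerated directly from the formulas $\Delta I = \alpha + \alpha^2/5$ and $\delta I = \alpha$. In the sketch below the relative error is taken as $(\delta I - \Delta I)/|\Delta I|$; this sign convention is our assumption, chosen because it reproduces the signs shown in the table.

```python
def delta_I(alpha):
    """Exact increment: integral over [0, 1] of (2x + alpha x^2)^2 - (2x)^2."""
    return alpha + alpha ** 2 / 5.0

def variation_I(alpha):
    """Variation: the part of the increment that is linear in alpha."""
    return alpha

rows = []
for alpha in (1.0, -0.1, 0.01):
    inc, var = delta_I(alpha), variation_I(alpha)
    rows.append((alpha, inc, var, (var - inc) / abs(inc)))

for alpha, inc, var, err in rows:
    print(alpha, inc, var, "{:.1%}".format(err))
```

The relative error is proportional to $\alpha$ for small $\alpha$, which is just the statement that the variation is the principal linear part of the increment.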
Sec. 12.3

In both cases, equation (22) is of the form $(1 - x)\cdot 2(y - 2x) = 0$, whence $y = 2x$. This solution gives the functional a stationary value of $I = 0$, which is minimal only in the case (a). In the case (b) the solution has the nature of a minimax (cf. Sec. 4.6); the minimum problem has no solution.

Sec.  12.4 

1. (a) The Euler equation has the form $2y - \frac{d}{dx}(2y') = 0$, or $y'' - y = 0$, whence $y = C_1e^x + C_2e^{-x}$ (see Sec. 7.3). From the boundary conditions we get $0 = C_1 + C_2$, $1 = C_1e + C_2e^{-1}$, whence $C_1 = \frac{1}{e - e^{-1}}$, $C_2 = -\frac{1}{e - e^{-1}}$ and, finally, $y = \frac{e^x - e^{-x}}{e - e^{-1}}$;
(b) by formula (34), $yy'^2 - 2yy'\cdot y' = C$, $yy'^2 = -C$, $\sqrt{y}\,dy = \pm\sqrt{-C}\,dx$, $y^{3/2} = \pm C_1x + C_2 = C_1x + C_2$. From the boundary conditions $p^{3/2} = C_2$, $q^{3/2} = C_1 + C_2$, whence $y^{3/2} = (q^{3/2} - p^{3/2})x + p^{3/2}$ and, finally, $y = [(q^{3/2} - p^{3/2})x + p^{3/2}]^{2/3}$.

2. Substitution of the last solution into $I$ yields (after calculations) the value $\frac{4}{9}(q^{3/2} - p^{3/2})^2$, whence $\frac{\partial I}{\partial p} = -\frac{4}{3}p^{1/2}(q^{3/2} - p^{3/2})$. By the formula derived in the text, $\frac{\partial I}{\partial p} = -(2yy')\big|_{x=0} = -2p\cdot\frac{2}{3}[(q^{3/2} - p^{3/2})x + p^{3/2}]^{-1/3}(q^{3/2} - p^{3/2})\big|_{x=0} = -\frac{4}{3}p^{1/2}(q^{3/2} - p^{3/2})$, which is the same thing.

3. Since the length of the graph $y = f(x)$ ($a \leq x \leq b$) is equal to $\int_a^b \sqrt{1 + y'^2}\,dx$, the problem reduces to determining the curve that minimizes the functional under the conditions (26). From (34) we get $\sqrt{1 + y'^2} - \frac{2y'}{2\sqrt{1 + y'^2}}\,y' = C$, whence $y' = C_1$, $y = C_1x + C_2$. This is the equation of a straight line.

Sec.  12.5 

From the Euler equation it turns out that for $a \neq \pi, 2\pi, 3\pi, \ldots$ the only function that satisfies the boundary conditions and gives the functional a stationary value is $y = 0$. Substituting $y = Cx(a - x)$, we get $I = C^2\frac{a^3}{30}(10 - a^2)$, whence we see that for $a > \sqrt{10}$ the minimum problem has no solution. Substituting $y = C\sin\frac{\pi x}{a}$, we get $I = \frac{C^2}{2a}(\pi^2 - a^2)$, whence it is apparent that for $a > \pi$ the minimum problem does not have a solution. It can be demonstrated that such will be the case for $a = 2\pi, 3\pi, \ldots$ as well, and for $a < \pi$ the function $y = 0$ serves as a solution to the minimum problem.

Sec.  12.6 

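1. As for (27), we arrive at the equation $\int_a^b (F'_y\,\delta y + F'_{y'}\,\delta y' + F'_{y''}\,\delta y'')\,dx = 0$. Integrating the second term by parts once and the third twice, we get

$$\int_a^b \left(F'_y - \frac{d}{dx}F'_{y'} + \frac{d^2}{dx^2}F'_{y''}\right)\delta y\,dx = 0$$

whence we arrive at the desired equation:

$$F'_y - \frac{d}{dx}F'_{y'} + \frac{d^2}{dx^2}F'_{y''} = 0$$

2. For brevity, set $z'_x = p$, $z'_y = q$; we get the equation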

dz  -rtr 


77'  77"  . 77" 

± z x %>x  pz 


dx 


— Fr 


dx2 


O 77"  77"  77" 

Z^p?77  F qy  r qz  — 

dxoy  dy 


Ff 

■*-  a 


Sec.  12.7 


d2z 

dy2 


= 0 


(a) $u^* = x^2 - y^2 + z^2 - 2x - \lambda(x + 2y - z)$, whence the stationarity conditions yield

$$2x - 2 - \lambda = 0, \quad -2y - 2\lambda = 0, \quad 2z + \lambda = 0$$

Then, expressing $x$, $y$, $z$ in terms of $\lambda$ and substituting into the constraint equation, we get $\lambda = -2$, whence the coordinates of the point of conditional extremum are $x = 0$, $y = 2$, $z = 1$;
(b) $u^* = x^2 - y^2 + z^2 - 2x - \lambda_1(x + y - z) - \lambda_2(x + 2y)$. From this $2x - 2 - \lambda_1 - \lambda_2 = 0$, $-2y - \lambda_1 - 2\lambda_2 = 0$, $2z + \lambda_1 = 0$. Invoking the constraint equations, we get $x = \frac{1}{2}$, $y = \frac{1}{4}$, $z = \frac{3}{4}$.

Sec.  12.8 

1. Let $OX$ be the axis of rotation, and $y = y(x) \geq 0$ ($a \leq x \leq b$) the equation of the contour of the axial cross section; then the volume and the surface of the solid of revolution are expressed by the formulas (see HM, Sec. 4.7) $V = \pi\int_a^b y^2\,dx$, $S = 2\pi\int_a^b y\sqrt{1 + y'^2}\,dx$. The intermediate integral (34) for the function $\pi(y^2 - 2\lambda y\sqrt{1 + y'^2})$ is of the form $\pi(y^2 - 2\lambda y\sqrt{1 + y'^2}) + 2\pi\lambda\frac{yy'}{\sqrt{1 + y'^2}}\,y' = C_1$, or $\pi\left(y^2 - \frac{2\lambda y}{\sqrt{1 + y'^2}}\right) = C_1$. Since $y$ can be zero, then $C_1 = 0$ and we get $y = \frac{2\lambda}{\sqrt{1 + y'^2}}$, whence, integrating, we find $(x + C)^2 + y^2 = 4\lambda^2$. The desired solid is a sphere.

2. Since the length $L$ of a suspended string $(L)$ with equation $y = y(x)$ is given and the height of the centre of gravity, as follows from Sec. 11.2, is determined by the formula $y_{c.g.} = \frac{1}{L}\int_{(L)} y\,dL$, the problem reduces to minimizing the integral $B = \int_{(L)} y\,dL = \int_{(L)} y\sqrt{1 + y'^2}\,dx$ for a given $L = \int_{(L)}\sqrt{1 + y'^2}\,dx$. The Euler equation has to be written for the function $y\sqrt{1 + y'^2} - \lambda\sqrt{1 + y'^2} = (y - \lambda)\sqrt{1 + y'^2}$, which, after the substitution $y - \lambda = y_1$, goes into the function investigated in the example of Sec. 12.5. The solution of the problem is a catenary with equation $y = \frac{1}{2k}[e^{k(x + C)} + e^{-k(x + C)}] + \lambda$, where $k$, $C$, $\lambda$ are arbitrary constants.

3. In the first example, $\lambda = \frac{R}{2}$, where $R$ is the radius of a sphere. In the second example, $\lambda$ is equal to the difference between the ordinate of the extreme lower point of the catenary and the radius of curvature at this point.

Sec. 12.9

1.

The  function  z has  a stationary  value  z = 0 at  the  point  > — j • 
On  the  legs  of  the  triangle  there  is  one  conditional  stationary 
value  z — ~ at  the  point  ^0,  j and  another  one  z = — at 

the  point  > oj.  At  the  vertices  the  values  are  0,  — 2,  and  2. 

Hence, $z_{\max} = 2$ is attained at the point (2, 0), $z_{\min} = -2$ at
the point (0, 2).

2. In Exercise 2 of Sec. 12.8 we obtained the equation of the hanging
portions of the string. Two cases are possible here. If the string
is sufficiently short, $L < L_0$, so that it does not reach the
"floor", i.e. the straight line $y = 0$, we get

$$y = 1 + \frac{e^{kx} + e^{-kx}}{2k} - \frac{e^{k} + e^{-k}}{2k}$$

Here, $L = \frac{e^{k} - e^{-k}}{k}$, so that for a given $L$ it is possible to find $k$
numerically. Since $y_{\min} = \frac{2 + 2k - e^{k} - e^{-k}}{2k}$, it follows that $L_0$ is
determined from the $k > 0$ for which $2 + 2k = e^{k} + e^{-k}$. For $L > L_0$,
the curve consists of two hanging portions and one lying on
the floor. The right hanging portion has the equation

$$y = \frac{e^{k(x - C)} + e^{-k(x - C)}}{2k} - \frac{1}{k}, \quad\text{and}\quad \frac{e^{k(1 - C)} + e^{-k(1 - C)}}{2k} - \frac{1}{k} = 1$$

whence we can express $C$ in terms of $k$. The entire curve is of
length $L = 2C + \frac{e^{k(1 - C)} - e^{-k(1 - C)}}{k}$.

Sec.  12.10 

1. The equation of the ellipse is $\sqrt{(x + c)^2 + y^2} + \sqrt{(x - c)^2 + y^2} = L$;
after manipulations, we have $4(L^2 - 4c^2)x^2 + 4L^2y^2 = L^2(L^2 - 4c^2)$.
The equations of the tangent and the normal
at an arbitrary point $(x_0, y_0)$ are, respectively,

$$y - y_0 = -\frac{(L^2 - 4c^2)x_0}{L^2 y_0}(x - x_0) \quad\text{and}\quad y - y_0 = \frac{L^2 y_0}{(L^2 - 4c^2)x_0}(x - x_0)$$

The desired angles are found from the formula
$\tan(\beta - \alpha) = \frac{\tan\beta - \tan\alpha}{1 + \tan\alpha\tan\beta}$. For the first angle we have to put
$\tan\alpha = \frac{y_0}{x_0 + c}$, $\tan\beta = \frac{L^2 y_0}{(L^2 - 4c^2)x_0}$, whence, after some manipulating,
we get $\tan(\beta - \alpha) = \frac{4cy_0}{L^2 - 4c^2}$. Such also is the tangent
of the second angle, whence follows the last assertion in
Exercise 1.

2. If the point $N$ has coordinates $(0, y_N)$, then
$L = Y - y_K + \sqrt{x_K^2 + (y_K - y_N)^2} = Y - x_K^2 + \sqrt{x_K^2 + (x_K^2 - y_N)^2}$. From the
condition $dL/dx_K = 0$ we get $y_N = \frac{1}{4}$. The equations of the tangent
and the normal at the point $K$ are, respectively, of the form

$$y - x_K^2 = 2x_K(x - x_K) \quad\text{and}\quad y - x_K^2 = -\frac{1}{2x_K}(x - x_K)$$

and the equation of the straight line $NK$ has the form
$y - x_K^2 = \frac{x_K^2 - 1/4}{x_K}(x - x_K)$. The equality of the required angles
is proved as in Exercise 1.
3. Let the points $S$ and $A$ have the coordinates $(-1, 1)$ and $(1, 1)$.
Then $L = \sqrt{(x + 1)^2 + 1} + \sqrt{(x - 1)^2 + 1}$, $x_0 = 0$,
$L''(x_0) = \frac{1}{\sqrt{2}}$ cm$^{-1}$, whence the reflecting region has diameter equal to

$$2\sqrt{\frac{5 \cdot 10^{-5}}{1/\sqrt{2}}} \approx 1.7 \cdot 10^{-2} \text{ cm} = 0.17 \text{ mm}$$

Sec. 12.11

In the given example,


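$$L = T - U = \frac{1}{2}\int_0^l \rho\left(\frac{\partial y}{\partial t}\right)^2 dx - \int_0^l \left[\frac{P}{2}\left(\frac{\partial y}{\partial x}\right)^2 + \frac{k}{2}y^2 - f(x)\,y\right] dx$$

Writing down the action integral and then applying the Euler
equation of Sec. 12.6, we get the desired equation in the form

$$-ky + f(x) - \frac{\partial}{\partial t}\left(\rho\frac{\partial y}{\partial t}\right) + \frac{\partial}{\partial x}\left(P\frac{\partial y}{\partial x}\right) = 0, \quad\text{that is,}\quad \frac{\partial^2 y}{\partial t^2} = \frac{P}{\rho}\frac{\partial^2 y}{\partial x^2} - \frac{k}{\rho}y + \frac{f(x)}{\rho}$$

Sec. 12.12

1. Substituting $y = x + C\sin\pi x$ into (75), we get
$I = \frac{4}{3} + \frac{2}{\pi}C + \frac{\pi^2 + 1}{2}C^2$. From the condition $\frac{dI}{dC} = 0$ we find
$C = -\frac{2}{\pi(\pi^2 + 1)} = -0.0586$. The values of $I$ and $y(0.5)$ turn out equal to 1.315
(error: 0.2%) and 0.441 (error: 0.7%).

2.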

5 C 7C2 

Substituting,  we  get  I = 7 + ^ + ' 


From  the  minimality  condition  we  have  C = 
= 1.247. 


1 T — — — 
1*  ~ 70  _ 


yiQ 

3.  Substituting,  we  get  I = — 1 -.  From  the  minimality 

2 7I2 

32 

condition  we  have  C — 0.331.  The  values  of  I and 

7T4 

u( 0,  0)  come  out  equal  to  — 0.537  and  —0.331  instead  of  —0.556 
and  -0.312. 
Chapter 13

THEORY  OF  PROBABILITY 


13.1  Statement  of  the  problem 

In nature, engineering and elsewhere we frequently encounter
phenomena whose study requires knowledge of a special branch of
mathematics called the theory of probability.

The most elementary example and the one almost invariably used
in this connection is coin tossing. Flip a coin and it will fall heads
or tails. If a coin is flipped so that it turns several times in the air,
it will, on the average, come up heads the same number of times as it
does tails. A coin tossed a very large number of times $N$ will fall
approximately $\frac{1}{2}N$ times heads and $\frac{1}{2}N$ times tails.
The factors of $N$ are the same, equal to $\frac{1}{2}$, both for the case of heads
and for the case of tails. They are called probabilities in this case.
We say that in a single tossing of a coin the probability of falling
heads, $w_h$, is equal to $\frac{1}{2}$, and the probability of falling tails, $w_t$, is
also equal to $\frac{1}{2}$.


Let us take another instance. Consider a die, one face of which
is white and the others black. If the die is thrown a large number
of times $N$, we approximately get $(1/6)N$ times white and $(5/6)N$
times black (it is assumed that the die is made of homogeneous
material). Here we say that the probability of the white face turning
up is $w_{wh} = 1/6$ and the probability of a black face turning up is
$w_b = 5/6$. It is clear that if there are two mutually exclusive
outcomes in a separate trial (heads or tails in the first example, and
a white or black face in the second one), then the sum of their
probabilities is 1.
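
The statement that an outcome of probability $1/6$ occurs in roughly
$(1/6)N$ of $N$ trials can be tried out with a small simulation. This is
only a sketch: the function name, the fixed seed and the trial counts are
illustrative assumptions.

```python
import random


def relative_frequency(p_success, n_trials, seed=0):
    """Fraction of n_trials independent trials that succeed with probability p_success."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    hits = sum(1 for _ in range(n_trials) if rng.random() < p_success)
    return hits / n_trials


if __name__ == "__main__":
    # White face of the die (probability 1/6) for a growing number of throws.
    for n in (100, 10_000, 1_000_000):
        print(n, relative_frequency(1 / 6, n))
```

The printed fractions should settle near 1/6 as the number of throws grows,
while short runs may deviate noticeably.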

Let us examine the question of radioactive decay (this was
considered in some detail in HM, Ch. 5). We intend to observe a
separate atom of the radioactive material. The probability that the
atom will decay in a small time $\tau$ that has elapsed from the start
of observations is equal to $w\tau$, where $w$ is a constant, which means
the probability is proportional to the time interval $\tau$. And hence
the probability that radioactive decay during this time will not
occur is equal to $1 - w\tau$. The constant $w$ characterizes the given
radioactive substance. It has the dimensions of 1/s and is connected
with the mean lifetime $T$ of the given element by the relation $w = 1/T$.

Clearly, the probability of decay may be equal to $w\tau$ only for
very small intervals of time $\tau$. Indeed, for large $\tau$ we can obtain,
say, $w\tau > 1$, which is obviously meaningless. And so we have to
consider an extremely small time interval $dt$. Then we will be dealing
with a very small probability of decay $w\,dt$ and with a probability
$(1 - w\,dt)$, extremely close to unity, that the atom will not experience
decay at all. From this, we will get, in Sec. 13.5, the probability
$a$ of not decaying in a finite time $t$, $a = (1 - w\,dt)^{t/dt} = e^{-wt}$, and the
probability $b$ of decaying in this time, $b = 1 - a = 1 - e^{-wt}$.
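
The limit behind the expression $(1 - w\,dt)^{t/dt} = e^{-wt}$ can be checked
numerically. In the sketch below the finite time $t$ is cut into a growing
number of small intervals; the values of $w$ and $t$ and the function name are
illustrative assumptions.

```python
import math


def survival_probability(w, t, steps):
    """(1 - w*dt)**(t/dt) with dt = t/steps: no decay in any of the small intervals."""
    dt = t / steps
    return (1.0 - w * dt) ** steps


if __name__ == "__main__":
    w, t = 0.5, 3.0  # illustrative decay constant (1/s) and observation time (s)
    for steps in (10, 100, 10_000, 1_000_000):
        print(steps, survival_probability(w, t, steps), math.exp(-w * t))
```

As the intervals shrink, the product of the many "no decay" probabilities
approaches the exponential.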

There are cases where several (more than two) outcomes of a
single trial are possible. For instance, in throwing an ordinary die
(a cube with numbers ranging from 1 to 6 on the faces) we have
six possible outcomes: 1, 2, 3, 4, 5, or 6 turning up. The
probability of each outcome is equal to 1/6.

Finally, the result of a separate trial may be described by a
quantity that assumes a continuous range of values. Take, for
example, the case of fishing with a rod; we let $p$ describe the weight
of each fish caught. We can divide the fish roughly into three
classes: small (up to 100 grams), medium (from 100 grams to 1 kilogram),
and large (exceeding 1 kg). Then the probable outcome of catching
a single fish will be described by three numbers: the probability of
catching a small fish $w_{sm}$, a medium fish $w_{med}$ and a big fish
$w_{big}$. Then

$$w_{sm} + w_{med} + w_{big} = 1$$

Of course this is a very crude description of fishing. For example,
"catching a big fish" may mean one weighing 1.1 kg or 20 kg.

The probability of catching a fish weighing between $p$ and
$p + dp$, where $dp$ is a very small increment in weight, will be
denoted by $dw$. This probability is naturally proportional to $dp$.
Besides, $dw$ depends on $p$ as well.

Indeed, there are no grounds to assume that the probability
of catching a fish weighing between 100 and 110 grams is the same
as that of catching a fish weighing between 1000 and 1010 grams.
We therefore put

$$dw = F(p)\,dp$$

The function $F(p)$ here is called the probability distribution function.
We know that the sum of all the probabilities of catching a
fish of one kind or another is equal to 1, which yields

$$\int_0^\infty F(p)\,dp = 1$$

We can modify the problem and instead of regarding a catch
as a separate trial we can take casting the line. The result is either
a catch or (more often, unfortunately) a miss. We can introduce
the probability $k$ of pulling in an empty line and the probability
of catching a fish (of any weight) equal to $1 - k$, and then
subdivide the cases of catching one or another type of fish in accordance
with the function $F(p)$. Alternatively, we might regard an empty
line as a "catch of weight zero", thus relieving us of saying no
fish were caught. Then we have a new probability distribution
function $f(p)$. This will be a function for which zero is a singular point.
The integral $\int_0^\infty f(p)\,dp = 1$, but the point $p = 0$ will make a finite
contribution to this integral. This means that $f(p)$ contains a term
of the form $k\,\delta(p)$ (see the delta function discussed in Ch. 6). The
new function $f(p)$, which refers to a single throw of the line, is
connected with the old function $F(p)$ (which referred to a single real,
with $p > 0$, caught fish) by the relation

$$f(p) = k\,\delta(p) + (1 - k)F(p)$$
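
The relation between $f(p)$ and $F(p)$ can be read as a recipe for simulating
a single cast of the line: with probability $k$ the weight is zero, otherwise
it is drawn from $F(p)$. The sketch below follows that reading. The exponential
form chosen for $F(p)$, its mean of 0.3 kg, the value $k = 0.7$ and all function
names are assumptions made purely for illustration.

```python
import random


def cast_line(k, sample_weight, rng):
    """One cast: weight 0 with probability k (empty line), otherwise a draw from F."""
    if rng.random() < k:
        return 0.0
    return sample_weight(rng)


def simulate(k, n_casts, seed=0):
    """Weights obtained in n_casts casts of the line."""
    rng = random.Random(seed)
    # Assumed F(p): exponential distribution with mean 0.3 kg (hypothetical).
    sample_weight = lambda r: r.expovariate(1 / 0.3)
    return [cast_line(k, sample_weight, rng) for _ in range(n_casts)]


if __name__ == "__main__":
    catches = simulate(k=0.7, n_casts=100_000)
    empty = sum(1 for p in catches if p == 0.0) / len(catches)
    print("fraction of empty lines:", empty)
    print("mean weight per cast:", sum(catches) / len(catches))
```

The finite fraction of casts landing exactly at $p = 0$ is what the delta-function
term in $f(p)$ expresses.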

The  foregoing  examples  illustrate  the  material  dealt  with  by 
the  theory  of  probability.  We  now  consider  certain  questions  that 
arise  in  this  connection. 

For example, let a coin be tossed five times. What is the
probability that it will fall heads in all five cases? What is the
probability that it will fall heads in four cases and tails in one, the order
in which they appear being immaterial? Here we assume that the
probability of obtaining heads in a single toss is equal to $\frac{1}{2}$. The
solution of problems of this type requires a certain amount of
manipulative skill, and for very large numbers of tosses one has to
resort to methods of higher mathematics, for otherwise the
computations are practically impossible to carry out. The next range of
questions has to do with radioactive decay. Starting with the
probability of decay in time $dt$, it is required to find the probability of
decay in time $t$, which may be of any duration, not necessarily
small. For this we will need the apparatus of higher mathematics.
(In particular, we will obtain a number of the results given in HM,
Ch. 5, although the reasoning will be somewhat different.)
In Ch. 5 of HM we considered a large number of atoms $N$
and obtained a law of the variation of $N$ with time. Actually, we
considered only mean values there. The new approach will enable
us to solve much more subtle problems such as: What is the
probability of observing, in a certain instrument, a number of
disintegrations that differs from the mean?

In the fishing problem, we can pose the question of the
probability of obtaining a certain overall catch as the result of catching
2, 3, ..., $n$ fish or as the result of casting the line 2, 3, ..., $n$ times.
The latter problem calls for rather involved mathematical
techniques.

In deriving the laws of probability theory we will need the
Stirling formula (Sec. 3.3), and also the formula
$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$ that was derived in Sec. 4.7.

Exercise 

In the card game of "dunce" we use a pack of 36 cards of
four suits. What is the probability of the first card dealt being
a spade? a queen? a queen of spades? a trump card (the
trump suit is the suit of the first card obtained after dealing)?

13.2  Multiplication  of  probabilities 

The  basis  for  solving  the  problems  posed  in  Sec.  13.1  is  the 
law  of  compound  probabilities  of  independent  events. 

This can be illustrated in the following simple example.
Suppose a die has one white face and five black faces so that the
probability of a white face appearing is $w_w = 1/6$, the probability of
a black face turning up is $w_b = 5/6$. Suppose we have another die
with two faces green and four red. For this die, the probabilities
of a green face and a red face turning up are, respectively, equal to
$w_g = 2/6 = 1/3$ and $w_r = 4/6 = 2/3$.

Put both dice in a glass, shake them and then throw them on
the table. Four distinct outcomes are possible: one white, the other
green; one white, the other red; one black, the other green; and,
finally, one black and the other red. These outcomes will be denoted
as: wg, wr, bg, br. The numbers of cases in which these outcomes
are realized are denoted by $N_{wg}$, $N_{wr}$, $N_{bg}$, $N_{br}$, and the
corresponding probabilities, by $w_{wg}$, $w_{wr}$, $w_{bg}$, $w_{br}$. We now pose the problem
of determining these probabilities.

Suppose we have carried out a very large number of trials $N$.
We divide them into two groups: those in which the first die
exhibits a white face (w) (irrespective of the colour of the second die)
and those in which the first die exhibits a black face (b) (irrespective
of the colour of the face that turns up on the second die). Thus,
$N = N_w + N_b$. Since $w_w = N_w/N$, it follows that $N_w = w_w \cdot N = \frac{1}{6}N$.
Similarly, $N_b = w_b \cdot N = \frac{5}{6}N$.

On the other hand, it is clear that $N_w = N_{wg} + N_{wr}$. We will
now need the notion of the independence of events. We assume that
the fact that the first die turned up white has no effect on the
probability of green or red turning up on the second die. In other
words, we consider the two events (a definite colour turning up
on the first die and a definite colour turning up on the second die)
to be independent. The outcome of one of these events cannot
in any way affect the outcome of the other. For this reason, the
probability of green turning up on the second die is $w_g = N_{wg}/N_w$, while
the probability of red turning up on the second die is $w_r = N_{wr}/N_w$.
From this we find $N_{wg} = w_g \cdot N_w = w_g \cdot w_w \cdot N$. On the other hand,
$w_{wg} = N_{wg}/N$, and so

$$w_{wg} = w_w \cdot w_g$$

Thus, the probability of a compound event (white turning up
on one die and green on the other one) is equal to the product
of the probabilities of simple independent events.

It is now clear that

$$w_{wr} = w_w \cdot w_r, \quad w_{bg} = w_b \cdot w_g, \quad w_{br} = w_b \cdot w_r$$

We  have  already  said  that  only  the  following  four  outcomes 
are  possible  in  a simultaneous  throw  of  the  two  dice  we  have  been 
discussing:  wg,  wr,  bg,  br. 

And so we must have

$$w_{wg} + w_{wr} + w_{bg} + w_{br} = 1$$

It is easy to see that this is so, for

$$w_{wg} + w_{wr} + w_{bg} + w_{br} = w_w \cdot w_g + w_w \cdot w_r + w_b \cdot w_g + w_b \cdot w_r$$

$$= w_w \cdot (w_g + w_r) + w_b \cdot (w_g + w_r)$$

$$= (w_g + w_r)(w_w + w_b)$$

Each of the last two parentheses contains the sum of the
probabilities of two simple events, each of which precludes the other
and some one of which must definitely occur. (If white turns up,
then black cannot turn up, and vice versa. And either black or white
will turn up definitely.) It is clear that such a sum of
probabilities is equal to unity:

$$w_w + w_b = 1, \quad w_g + w_r = 1$$

whence it follows that

$$w_{wg} + w_{wr} + w_{bg} + w_{br} = 1$$
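
The law of compound probabilities for the two dice can be checked with exact
fractions. The sketch below uses the face counts stated in the text (one white
and five black faces, two green and four red); the dictionary layout and the
function name are illustrative.

```python
from fractions import Fraction
from itertools import product

# Probabilities of the simple events for each die, as exact fractions.
first_die = {"w": Fraction(1, 6), "b": Fraction(5, 6)}
second_die = {"g": Fraction(2, 6), "r": Fraction(4, 6)}


def compound_probabilities(die_a, die_b):
    """Probability of every pair of independent outcomes as a product."""
    return {
        (a, b): pa * pb
        for (a, pa), (b, pb) in product(die_a.items(), die_b.items())
    }


joint = compound_probabilities(first_die, second_die)

if __name__ == "__main__":
    for outcome, probability in joint.items():
        print(outcome, probability)
    print("sum:", sum(joint.values()))
```

Working with `Fraction` avoids rounding, so the sum over the four outcomes
comes out as exactly 1.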

Now let us take up the next example. Suppose we have a die
with a certain number (precisely what number is immaterial) of white
faces and a certain number of black faces. We denote the probability
of white and black occurring in one throw of the die by $\alpha$ and $\beta$
respectively. Suppose an experiment consists in three throws of the
die. Since there are two possible outcomes in each throw, there
will be $2^3 = 8$ distinct possible outcomes in the experiment.*

We enumerate them in the array given below.

First    Second   Third    Outcome of      Probability
throw    throw    throw    three throws    of outcome

w        w        w        1w, 2w, 3w      $\alpha^3$
w        w        b        1w, 2w, 3b      $\alpha^2\beta$
w        b        w        1w, 2b, 3w      $\alpha^2\beta$
w        b        b        1w, 2b, 3b      $\alpha\beta^2$
b        w        w        1b, 2w, 3w      $\alpha^2\beta$
b        w        b        1b, 2w, 3b      $\alpha\beta^2$
b        b        w        1b, 2b, 3w      $\alpha\beta^2$
b        b        b        1b, 2b, 3b      $\beta^3$

Each row is one of the eight possible outcomes. In the second
to the last column, the numeral indicates the number of the throw
and the letter the outcome of that throw. For example, 1b, 2w,
3b means black in the first throw, white in the second, and black
again in the third. The probabilities of each of these eight outcomes
are given in the last column. These probabilities are readily computed
by the law of compound probabilities.

In the foregoing table we distinguish, say, between the cases
1w, 2w, 3b and 1b, 2w, 3w; in both cases there was one black
face and two whites, only in the former case the black face turned
up in the third throw and in the latter case it turned up in the
first throw. Ordinarily, in specific problems of this type we are only
interested in the total number of occurrences of a white face and
the total number of occurrences of black, the order of the occurrences
being immaterial.

* If the experiment consists in throwing a die $n$ times, then there are $2^n$ possible
distinct outcomes.
From this standpoint, the eight cases considered above fall
into four groups:

w = 3, b = 0;  w = 2, b = 1;  w = 1, b = 2;  w = 0, b = 3.*

This notation is clear enough: w = 2, b = 1 stands for a group
consisting of all cases in which white turned up twice and black
once. (The order of occurrence is immaterial.) Now let us set up
a table indicating all the groups, all cases into which each group
falls, the probabilities of each case, and the probabilities of each group.

To compute the probability of the group, note the following.

If some group of events combines several disjoint cases, then
the probability of the group, that is, the probability of some one
case occurring, is equal to the sum of the probabilities of the
separate cases that constitute the group. An example will illustrate
this. Suppose we toss a die with white, black, and red faces, the
probabilities of occurrence of the faces of these colours being equal,
respectively, to

$$w_w, \; w_b, \; w_r \quad (w_w + w_b + w_r = 1)$$

Consider a group of events consisting in the fact that either a white
or a black face turns up. We denote the probability of this group
by $w_{w+b}$. Let us find $w_{w+b}$.

Since $w_w = N_w/N$, $w_b = N_b/N$, it follows that $N_w = w_w \cdot N$,
$N_b = w_b \cdot N$, where $N$ is the total number of throws of the die. It
is clear that

$$w_{w+b} = \frac{N_w + N_b}{N} = \frac{w_w \cdot N + w_b \cdot N}{N} = w_w + w_b$$

And so $w_{w+b} = w_w + w_b$.

We now have the following table:

Group           Case            Probability        Probability
                                of case            of group

w = 3, b = 0    1w, 2w, 3w      $\alpha^3$         $\alpha^3$

w = 2, b = 1    1w, 2w, 3b      $\alpha^2\beta$
                1w, 2b, 3w      $\alpha^2\beta$    $3\alpha^2\beta$
                1b, 2w, 3w      $\alpha^2\beta$

w = 1, b = 2    1w, 2b, 3b      $\alpha\beta^2$
                1b, 2w, 3b      $\alpha\beta^2$    $3\alpha\beta^2$
                1b, 2b, 3w      $\alpha\beta^2$

w = 0, b = 3    1b, 2b, 3b      $\beta^3$          $\beta^3$

* If the trial consists in tossing a die $n$ times, then the $2^n$ cases fall into $n + 1$
groups:

w = n, b = 0;  w = n - 1, b = 1;  w = n - 2, b = 2;  ...;  w = 0, b = n
It is easy to see that in $n$ throws of a die, the probability of
white occurring $m$ times and black $k$ times ($m + k = n$) for a given
order of alternation of white and black faces is equal to $\alpha^m\beta^k$. The
group w = m, b = k includes all cases in which white occurred $m$
times and black occurred $k$ times with different order of alternation
of white and black.

The probability of the group w = m, b = k is equal to the term
in the binomial expansion

$$(\alpha + \beta)^n = \alpha^n + n\alpha^{n-1}\beta + \frac{n(n-1)}{1 \cdot 2}\alpha^{n-2}\beta^2 + \dots + \beta^n$$

that contains the factor $\alpha^m\beta^k$, which means it is equal to
$\frac{n!}{m!\,k!}\alpha^m\beta^k$.

Indeed, what is the coefficient of $\alpha^m\beta^k$ in the binomial expansion
of $(\alpha + \beta)^n$? It is the number of ways of taking $m$ factors $\alpha$ and $k$
factors $\beta$ by multiplying out the product

$$\underbrace{(\alpha + \beta)(\alpha + \beta)(\alpha + \beta)\cdots(\alpha + \beta)}_{n \text{ times}}$$

In exactly the same way, the number of cases in a group is the
number of ways of obtaining $m$ white faces and $k$ black faces by
alternating white and black faces in different ways. Consequently, the
coefficient of $\alpha^m\beta^k$ in the expansion of $(\alpha + \beta)^n$ by the binomial theorem
is equal to the number of cases in the group. The probability of each
definite case w = m, b = k is equal to $\alpha^m\beta^k$ by the law of compound
probabilities. Therefore, the probability of a group is equal to the
corresponding term in the binomial expansion.
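
The counting argument can be checked by brute force for small $n$: list all
$2^n$ cases, add up the probabilities of the cases in each group, and compare
with the term $\frac{n!}{m!\,k!}\alpha^m\beta^k$. The sketch below does this with exact
fractions; the value $\alpha = 1/3$, the choice $n = 5$ and the function names are
illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def group_probabilities(alpha, n):
    """Sum the probabilities of all 2**n cases, grouped by the number m of whites."""
    beta = 1 - alpha
    groups = {}
    for case in product("wb", repeat=n):
        m = case.count("w")
        # Law of compound probabilities for one definite order of faces.
        p_case = alpha**m * beta ** (n - m)
        groups[m] = groups.get(m, 0) + p_case
    return groups


def binomial_term(alpha, n, m):
    """The term n!/(m! k!) * alpha**m * beta**k with k = n - m."""
    k = n - m
    coefficient = Fraction(factorial(n), factorial(m) * factorial(k))
    return coefficient * alpha**m * (1 - alpha) ** k


if __name__ == "__main__":
    alpha, n = Fraction(1, 3), 5
    for m, p in sorted(group_probabilities(alpha, n).items()):
        print(m, p, binomial_term(alpha, n, m))
```

Enumeration is practical only for small $n$, since the number of cases doubles
with every extra throw; the binomial term gives the same value directly.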

It is clear that the sum of the probabilities of all groups must
be equal to 1, since all groups embrace all possible cases. Let us
verify this.

Since $\alpha + \beta = 1$, it follows that $(\alpha + \beta)^n = 1$. On the other
hand,

$$(\alpha + \beta)^n = \alpha^n + n\alpha^{n-1}\beta + \frac{n!}{(n-2)!\,2!}\alpha^{n-2}\beta^2 + \dots + \beta^n$$

$$= w(\mathrm{w} = n, \mathrm{b} = 0) + w(\mathrm{w} = n - 1, \mathrm{b} = 1) + w(\mathrm{w} = n - 2, \mathrm{b} = 2) + \dots + w(\mathrm{w} = 0, \mathrm{b} = n)\,^*$$

Thus

$$w(\mathrm{w} = n, \mathrm{b} = 0) + w(\mathrm{w} = n - 1, \mathrm{b} = 1) + \dots + w(\mathrm{w} = 0, \mathrm{b} = n) = 1$$

* The notation $w(\mathrm{w} = m, \mathrm{b} = k)$ stands for the probability of the group in which
white turns up $m$ times, and black $k$ times.
Exercises

1.  What  is  the  probability  that  in  two  tosses  of  a coin  heads  will 
come  up  twice? 

2.  A coin  is  tossed  three  times.  What  is  the  probability  that  it 
will  come  up  heads  in  all  three  cases  ? twice  heads  and  once  tails  ? 

3. Four faces of a die are black and two white. What is the
probability that in two throws white will occur twice? black
twice? white once and black once?

4. The die in Exercise 3 is thrown three times. What is the
probability that white will turn up twice and black once? black twice and
white once?

5.  A gun  is  fired  at  a target.  In  each  shot,  the  probability  of  a hit 
is  0.1,  of  a miss,  0.9.  Two  shots  are  fired.  What  is  the  probability 
that  one  is  a hit  and  the  other  a miss? 

6.  Under  the  conditions  of  Exercise  5 three  shots  are  fired.  What 
is  the  probability  that  one  is  a hit  and  two  are  misses  ? two  are 
hits  and  one  is  a miss? 

7.  The  situation  is  as  in  Exercise  5.  What  is  the  probability  of 
hitting  the  target  once  if  four  shots  are  fired  ? if  five  shots 
are  fired? 

13.3  Analysing  the  results  of  many  trials 

In the preceding we obtained general formulas for the case of $n$
identical trials in each of which two distinct outcomes are possible
with probabilities $\alpha$ and $\beta$, respectively.

For small values of $n$ these formulas are sufficiently simple and
pictorial.

But if $n$ is great, the formulas cease to be pictorial. Extra work
has to be done to cast the formulas in a convenient and clear-cut
form. Only after such a strenuous job is it possible to extract the
precious grains from the rock* and polish them into brilliant results
that will call forth further generalizations.

* As the poet Mayakovsky might have put it: "Science is like extracting radium,
with one gram of the precious mineral to show for a year of arduous labour, a
thousand tons of verbal ore to get a single unified formula".

Let us first consider the simplest case of $\alpha = \beta = \frac{1}{2}$,
corresponding to, say, flipping a coin or throwing a die with three white
and three black faces.

Suppose we take $n = 100$. It is clear that the probability of
obtaining 100 heads (or 100 tails) in 100 tosses of a coin is extremely
small, namely $\left(\frac{1}{2}\right)^{100} = \frac{1}{2^{100}} \approx \frac{1}{10^{30}}$. If
a machine performs 100 tosses of the coin per second, it will require
on the average $10^{28}$ seconds $\approx 3 \cdot 10^{20}$ years to obtain a single case of
a run of 100 heads. It is quite clear that in the majority of cases 100
tosses will yield roughly 50 heads and 50 tails. But it is not at all
obvious what the probability will be of exactly 50 heads and 50 tails
in a series of 100 tosses. Neither is it clear what the probability will
be of an outcome that differs from the average. For example, what is
the probability of 55 heads turning up and 45 tails, or 60 heads and
40 tails?

Let us try to answer these questions.

To start with, note that the probability of every specific
occurrence of 50 tails and 50 heads, say, in alternation is equal to
$\left(\frac{1}{2}\right)^{100} = \frac{1}{2^{100}} \approx \frac{1}{10^{30}}$,
as was established in Sec. 13.2. Which means that this probability
is equal to the probability of heads turning up one hundred times
in a row. The considerably greater probability of the occurrence of
50 heads and 50 tails (in any order) is due to the fact that this event
(this group), as was pointed out in Sec. 13.2, consists of a very large
number of different cases of different alternations of heads and tails.

In Sec. 13.2 we found out that this probability is equal to

$$\frac{n!}{m!\,k!}\alpha^m\beta^k$$

where $\alpha$ is the probability of heads, $\beta$ is the probability of tails, $n$
is the number of tosses, $m$ is the number of occurrences of heads, and $k$
is the number of occurrences of tails ($m + k = n$). We will consider
an even number $n$ (in our case, $n = 100$). We are interested in the
occurrence of $\frac{n}{2}$ tails and $\frac{n}{2}$ heads. Therefore

$$\frac{n!}{m!\,k!} = \frac{n!}{\left[\left(\frac{n}{2}\right)!\right]^2}$$

and the probability of the event that interests us is

$$w = \frac{n!}{\left[\left(\frac{n}{2}\right)!\right]^2}\left(\frac{1}{2}\right)^n$$

For large $n$ this expression is terribly unwieldy. Let us simplify
it by using the approximate formula of Stirling for the expression $n!$
(see Sec. 3.3).

By Stirling's formula we have

$$n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^n, \quad \left(\frac{n}{2}\right)! = \sqrt{\pi n}\left(\frac{n}{2e}\right)^{n/2}$$

and so we get

$$w\left(\text{heads} = \frac{n}{2},\ \text{tails} = \frac{n}{2}\right) = \sqrt{\frac{2}{\pi n}}$$

For $n = 100$ we then have $w = \sqrt{\frac{2}{100\pi}} \approx 0.08$. The
probability of 50 heads and 50 tails occurring in a definite order is roughly
equal to $10^{-30}$. And so the number of distinct cases in which the
same result is realized in a different order is equal to

$$\frac{0.08}{10^{-30}} = 8 \cdot 10^{28}$$
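
The quality of the Stirling estimate for the central probability can be
judged by comparing it with the exact value computed from integer binomial
coefficients. The following sketch does so for a few even $n$; the function
names are illustrative.

```python
from math import comb, pi, sqrt


def exact_central(n):
    """Exact probability of n/2 heads in n tosses of a fair coin (n even)."""
    return comb(n, n // 2) / 2**n


def stirling_central(n):
    """Approximation sqrt(2/(pi*n)) obtained from Stirling's formula."""
    return sqrt(2.0 / (pi * n))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, exact_central(n), stirling_central(n))
```

For $n = 100$ both values round to 0.08, in agreement with the figure quoted
in the text.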


Now let us try to determine the probability of the outcome of
a trial that differs but slightly from the most probable outcome (in
the example at hand, the most probable outcome is 50 times heads
and 50 times tails). Denote by $\delta$ the deviation of an outcome from
the most probable outcome, $\delta = m - \frac{n}{2}$. For example, $\delta = 5$
corresponds to 55 heads and 45 tails, $\delta = -5$ corresponds to 45 heads
and 55 tails, $\delta = 3$ to 53 heads and 47 tails, etc. Denote by $w(\delta)$
the probability of the appropriate outcome of a trial. The most
probable result is $\delta = 0$ so that

$$w(0) = \sqrt{\frac{2}{\pi n}} \qquad (1)$$


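We now compute the probability $w(\delta)$. This is the probability
of occurrence of $m = \frac{n}{2} + \delta$ heads and $k = \frac{n}{2} - \delta$ tails, and so it
is equal to

$$w(\delta) = \frac{n!}{\left(\frac{n}{2} + \delta\right)!\left(\frac{n}{2} - \delta\right)!}\,\alpha^{\frac{n}{2} + \delta}\beta^{\frac{n}{2} - \delta}$$

where $\alpha = \beta = \frac{1}{2}$. Consequently

$$w(\delta) = \frac{n!}{\left(\frac{n}{2} + \delta\right)!\left(\frac{n}{2} - \delta\right)!\,2^n}$$

Now increase $\delta$ by unity and compute $w(\delta + 1)$ to get

$$w(\delta + 1) = \frac{n!}{\left(\frac{n}{2} + \delta + 1\right)!\left(\frac{n}{2} - \delta - 1\right)!\,2^n}$$

Therefore

$$\frac{w(\delta + 1)}{w(\delta)} = \frac{\left(\frac{n}{2} + \delta\right)!\left(\frac{n}{2} - \delta\right)!}{\left(\frac{n}{2} + \delta + 1\right)!\left(\frac{n}{2} - \delta - 1\right)!}$$

Since

$$\left(\frac{n}{2} + \delta + 1\right)! = \left(\frac{n}{2} + \delta + 1\right)\left(\frac{n}{2} + \delta\right)!, \quad \left(\frac{n}{2} - \delta\right)! = \left(\frac{n}{2} - \delta\right)\left(\frac{n}{2} - \delta - 1\right)!$$

it follows that

$$\frac{w(\delta + 1)}{w(\delta)} = \frac{\frac{n}{2} - \delta}{\frac{n}{2} + \delta + 1} = \frac{1 - \frac{2\delta}{n}}{1 + \frac{2(\delta + 1)}{n}}$$

Taking logs of the right and left sides, we get

$$\ln w(\delta + 1) - \ln w(\delta) = \ln\left(1 - \frac{2\delta}{n}\right) - \ln\left(1 + \frac{2(\delta + 1)}{n}\right) \qquad (2)$$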


We consider a large number of trials $n$ and assume, besides, that
the quantity $|\delta| \ll \frac{n}{2}$, which means we are studying small deviations
from the most probable result. Therefore the quantities $\frac{2\delta}{n}$ and $\frac{2(\delta + 1)}{n}$
are small and, consequently, the logarithms on the right of (2)
can be expanded in series. In this expansion, we confine
ourselves to the first terms of the series:

$$\ln\left(1 - \frac{2\delta}{n}\right) \approx -\frac{2\delta}{n}, \quad \ln\left(1 + \frac{2(\delta + 1)}{n}\right) \approx \frac{2(\delta + 1)}{n}$$

Formula (2) assumes the form

$$\ln w(\delta + 1) - \ln w(\delta) = -\frac{4\left(\delta + \frac{1}{2}\right)}{n}$$

Note that $\ln w(\delta+1) - \ln w(\delta)$ may be replaced approximately
by the value of the derivative of the function $\ln w(z)$ computed at
the midpoint of the interval, i.e. for a value of the argument equal
to $\delta + \frac{1}{2}$, so that

$$\ln w(\delta+1) - \ln w(\delta) = \frac{d}{d\delta}\left[\ln w\left(\delta+\frac{1}{2}\right)\right]$$

Of course, $w\left(\delta+\frac{1}{2}\right)$ is not the probability that the deviation
of the number of heads from the most probable number will be equal
to $\delta + \frac{1}{2}$, for this deviation is of necessity integral. This is the result
of interpolating (Sec. 2.1) the function $w(\delta)$ with integral values
of the argument to half-integral values. And so we have

$$\frac{d}{d\delta}\left[\ln w\left(\delta+\frac{1}{2}\right)\right] = -\frac{4}{n}\left(\delta+\frac{1}{2}\right)$$

Here, put $\delta + \frac{1}{2} = z$; then $d\delta = dz$ and we get

$$\frac{d}{dz}\ln w(z) = -\frac{4z}{n}$$

Integrate this relation from $z = 0$ to $z = \delta$ to get
$\ln w(\delta) - \ln w(0) = -\frac{2\delta^2}{n}$ or, taking antilogarithms,

$$w(\delta) = w(0)\, e^{-\frac{2\delta^2}{n}}$$

Using (1), we finally get

$$w(\delta) = \sqrt{\frac{2}{\pi n}}\; e^{-\frac{2\delta^2}{n}} \qquad (3)$$

It is easy to see that this result agrees with the requirement

$$\int_{-\infty}^{+\infty} w(\delta)\, d\delta = 1 \qquad (4)$$

Knowing the type of dependence $w \propto e^{-2\delta^2/n}$, we could have
determined $w(0)$ from condition (4) without resorting to Stirling's
formula. Actually, $\delta$ is an integer and varies only over the range
$-\frac{n}{2} \le \delta \le \frac{n}{2}$, but for large $n$, $w(\delta)$ varies so slowly and at the
endpoints is so small that no noticeable mistake is introduced by replacing
the sum by an integral and integrating from $-\infty$ to $+\infty$.

The graph of $w$ versus $\delta$ is precisely of the same type that was
considered in Sec. 6.1 in connection with the definition of a delta
function. When $n = 2$ we get the graph y(2)(x) shown in Fig. 68,
and for other $n$ the graph y(2)(x) has to be compressed $\sqrt{n/2}$ times
towards the $x$-axis and stretched $\sqrt{n/2}$ times away from the $y$-axis.

The curve corresponding to formula (3) is called the probability
curve. It is a bell-shaped curve. From (3) it is evident that
$w(-\delta) = w(\delta)$, which is what was to be expected: the probability of 55
heads and 45 tails ($\delta = 5$) is equal to the probability of 45 heads and
55 tails occurring.

* By similar arguments, denoting $f(n) = \ln(n!)$ and proceeding from the equation
$f(n+1) - f(n) = \ln(n+1)$, we can derive the Stirling formula up to a constant
factor. (Do this!)

Using formula (3), we find for $n = 100$ (when, as we know,
$w(0) = 0.08$):

$$w(1) = w(0)\, e^{-0.02} = 0.98\, w(0),$$
$$w(2) = w(0)\, e^{-0.08} \approx 0.92\, w(0),$$
$$w(5) = w(0)\, e^{-0.5} = 0.61\, w(0),$$
$$w(10) = w(0)\, e^{-2} = 0.14\, w(0),$$
$$w(20) = w(0)\, e^{-8} = 0.00034\, w(0)$$
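The values obtained from formula (3) can be set against the exact binomial probabilities. The following is a minimal sketch, not part of the original text; the function names `w_exact` and `w_approx` are chosen here purely for illustration, and an even number of tosses of an unbiased coin is assumed.

```python
import math

def w_exact(n, delta):
    """Exact probability of n/2 + delta heads in n tosses of a fair coin (n even)."""
    return math.comb(n, n // 2 + delta) / 2 ** n

def w_approx(n, delta):
    """Formula (3): sqrt(2 / (pi n)) * exp(-2 delta^2 / n)."""
    return math.sqrt(2.0 / (math.pi * n)) * math.exp(-2.0 * delta ** 2 / n)

n = 100
for delta in (0, 1, 2, 5, 10, 20):
    # columns: delta, exact probability, approximate probability, ratio w(delta)/w(0)
    print(delta, w_exact(n, delta), w_approx(n, delta),
          w_approx(n, delta) / w_approx(n, 0))
```

For small deviations the two columns should nearly coincide; the agreement is expected to worsen as the deviation grows, in line with the assumption under which (3) was derived.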

Let us agree to use the term "expected" for those outcomes of
trials for which $\frac{1}{e}\, w(0) \le w(\delta) \le w(0)$. Cases with smaller
probability will occur rarely. Thus, expected events are those for which
$-\delta_1 \le \delta \le \delta_1$, where $\delta_1$ is found from the condition
$w(\delta_1) = \frac{1}{e}\, w(0)$ (Fig. 187). This yields
$w(0)\, e^{-\frac{2\delta_1^2}{n}} = \frac{1}{e}\, w(0)$ or $\frac{2\delta_1^2}{n} = 1$, whence we
finally get $\delta_1 = \frac{1}{2}\sqrt{2n}$. Hence, expected outcomes are those in which

$$-\frac{1}{2}\sqrt{2n} \le \delta \le \frac{1}{2}\sqrt{2n}$$

In our example, the most probable outcome is that corresponding
to $\delta = 0$ (50 heads and 50 tails). But its probability is slight:
it is equal to 0.08 and is only just a bit greater than the probability
of close-lying outcomes. For instance, the probability of obtaining
51 heads and 49 tails (or 49 heads and 51 tails) is almost the same:
$0.98 \cdot 0.08 = 0.078$.

The probability of obtaining 57 heads and 43 tails (or 43 heads
and 57 tails) is much less. This probability is equal to $\frac{0.08}{e} = 0.029$.
We can therefore consider as expected a number of heads ranging
from 43 to 57, that is, $50 \pm \delta$ where $0 \le \delta \le \delta_1 = 7$.

The quantity $\delta_1$ is proportional to $\sqrt{n}$, and so the larger $n$ is,
the broader the range of the expected outcome. For instance, when
$n = 10\,000$ we get $\delta_1 = \frac{\sqrt{20\,000}}{2} \approx 70$, so we should expect a result
with heads turning up from 4930 times to 5070 times. However, the
portion of $\delta_1$ relative to the number of trials $n$ diminishes with
increasing $n$, since the quantity $\frac{\delta_1}{n}$ is proportional to $\frac{1}{\sqrt{n}}$, i.e. it is
the smaller, the greater $n$ is.

Suppose we toss a coin to find experimentally the probability
of heads turning up. And suppose we do not know whether the coin
is unbiased or not (bent so that one side turns up more often than
the other). Suppose that the coin is even (unbiased) and the probability
of heads turning up is $w_h = 0.5$.

In 100 throws we most likely will get from 43 to 57 heads, or
$0.43 \le w_h \le 0.57$. The error in determining the probability will not
exceed 0.07 either way. This means that if we get 44 tails and 56
heads in 100 tosses, we should not conclude that there is a greater
probability of heads. The experiment is not sufficiently exact; the
deviation from 50 lies within the limits of statistical error, as we
say. All we can say is that the probability of heads lies within the
limits $w_h = 0.56 \pm 0.07$, which is to say between 0.49 and 0.63:
$0.49 \le w_h \le 0.63$; $w_h = 0.50$ is not in the least excluded. To make
this more precise, we have to increase the number of tosses. In the
case of 10 000 tosses, we will most likely have 4930 to 5070 heads,
which is $0.4930 \le w_h \le 0.5070$. Here the error in determining the
probability will not exceed 0.007 either way. It is clear that the error
$\frac{\delta_1}{n}$ in determining the probability is proportional to $\frac{1}{\sqrt{n}}$, which means
that to reduce the error 10 times in determining the probability we
have to increase the number of trials 100 times.
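The scaling of the statistical error with the number of tosses can be tabulated directly from $\delta_1 = \frac{1}{2}\sqrt{2n}$. This is a small illustrative sketch; the helper name `expected_half_width` is an assumption introduced here, not notation from the text.

```python
import math

def expected_half_width(n):
    """delta_1 = (1/2) * sqrt(2 n): half-width of the range of expected outcomes."""
    return 0.5 * math.sqrt(2.0 * n)

for n in (100, 10_000, 1_000_000):
    d1 = expected_half_width(n)
    # columns: number of tosses, delta_1, error delta_1 / n in the estimated probability
    print(n, round(d1, 1), d1 / n)
```

Each hundredfold increase in the number of tosses should reduce the last column tenfold.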

Formula (3) is obtained on the assumption that $\delta$ is small. It
is to be expected therefore that for large $\delta$ this formula will yield
considerable errors. To illustrate, for the case $n = 100$ let us compute
the probability $w(50)$, which is the probability that heads turn up
all 100 times and tails not once. Using (3) we find

$$w(50) = w(0)\, e^{-\frac{2 \cdot 50^2}{100}} = 0.08 \cdot e^{-50} \approx 10^{-23}$$

On the other hand, we have already calculated this probability by
the exact formula and found that it is equal to $10^{-30}$. Thus, the
approximate formula (3) does indeed produce substantial errors for large $\delta$.
But the errors are not important since for large $\delta$ these values of
the probabilities are practically negligible.

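What is the probability $w_{\text{exp}}$ of obtaining an expected result,
that is, a result for which $\delta$ lies in the range from $-\delta_1$ to $+\delta_1$?
This probability is obviously equal to

$$w_{\text{exp}} = \sum_{\delta=-\delta_1}^{+\delta_1} w(\delta)$$

or, using (3) and replacing the sum by an integral,

$$w_{\text{exp}} = \int_{-\delta_1}^{+\delta_1} \sqrt{\frac{2}{\pi n}}\; e^{-\frac{2\delta^2}{n}}\, d\delta$$

In this last integral we make the change of variable
$\delta = \frac{\sqrt{n}}{2}\, t$ $\left(t = \frac{2\delta}{\sqrt{n}}\right)$; then $d\delta = \frac{\sqrt{n}}{2}\, dt$. The integral will lie
within the range from $-t_1$ to $+t_1$, where

$$t_1 = \frac{2}{\sqrt{n}}\,\delta_1 = \frac{2}{\sqrt{n}} \cdot \frac{1}{2}\sqrt{2n} = \sqrt{2}$$

Therefore

$$w_{\text{exp}} = \sqrt{\frac{2}{\pi n}}\,\frac{\sqrt{n}}{2} \int_{-\sqrt{2}}^{+\sqrt{2}} e^{-\frac{t^2}{2}}\, dt = \frac{2}{\sqrt{2\pi}} \int_{0}^{\sqrt{2}} e^{-\frac{t^2}{2}}\, dt$$

Tables have been compiled for the function
$\Phi(x) = \frac{2}{\sqrt{2\pi}} \int_0^x e^{-\frac{t^2}{2}}\, dt$,
which is called the probability integral. (See the table at the end of
this chapter.) Using the table, we find

$$w_{\text{exp}} = \Phi(\sqrt{2}) = \Phi(1.414) = 0.842$$

Hence those cases we called expected cases* constitute 84% of all
possible outcomes of trials, the rest making up only 16%.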
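The tabulated value can be checked without a table. The sketch below assumes the identity $\Phi(x) = \operatorname{erf}(x/\sqrt{2})$, which follows from the definition of the probability integral given above by the substitution $t = \sqrt{2}\,u$; the names `Phi` and `exact_expected` are introduced here for illustration only.

```python
import math

def Phi(x):
    """Probability integral of the text: (2/sqrt(2 pi)) * int_0^x exp(-t^2/2) dt."""
    return math.erf(x / math.sqrt(2.0))

def exact_expected(n):
    """Exact binomial sum over integer deviations |delta| <= delta_1 (fair coin, n even)."""
    d1 = math.floor(0.5 * math.sqrt(2.0 * n))
    total = sum(math.comb(n, n // 2 + d) for d in range(-d1, d1 + 1))
    return total / 2 ** n

print(Phi(math.sqrt(2.0)))    # integral estimate of w_exp
print(exact_expected(100))    # direct sum for n = 100 (43 to 57 heads)
```

The direct sum over whole values of the deviation need not coincide exactly with the integral, since for moderate $n$ replacing the sum by an integral is itself only an approximation.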


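* It is clear that the expression "expected" is used here in a conventional sense
since the probability of obtaining a result that is not expected (we could hardly
say "unexpected") is not so small, 16%, under the accepted definition.

In some problems one wants the probability of a result for which
the quantity $\delta$ does not exceed a given $\delta_0$. This probability is

$$w(\delta \le \delta_0) = \int_{-\infty}^{\delta_0} \sqrt{\frac{2}{\pi n}}\; e^{-\frac{2\delta^2}{n}}\, d\delta$$

Making the same change of variable as in the preceding case,
we get

$$w(\delta \le \delta_0) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t_0} e^{-\frac{t^2}{2}}\, dt, \quad \text{where } t_0 = \frac{2\delta_0}{\sqrt{n}}$$

so that

$$w(\delta \le \delta_0) = \frac{1}{2} + \frac{1}{\sqrt{2\pi}} \int_0^{t_0} e^{-\frac{t^2}{2}}\, dt = \frac{1}{2} + \frac{1}{2}\,\Phi\left(\frac{2\delta_0}{\sqrt{n}}\right)$$

where $\Phi$ is taken from the table.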

Now let us consider the case $\alpha \ne \beta$. If $n$ throws of a die are
made (with the probability of a white face turning up in a single
throw being equal to $\alpha$ and of a black face, $\beta$), the probability of $m$
white and $k$ black faces occurring ($m + k = n$) is, by Sec. 13.2,

$$w = \frac{n!}{m!\,k!}\,\alpha^m \beta^k$$

Let us find the maximum of $w$ as a function of $m$. It is more
convenient to seek the maximum of $\ln w$ instead of $w$. (It is clear
that $w$ and $\ln w$ attain a maximum for the same value of $m$.)
We have $\ln w = \ln n! - \ln m! - \ln k! + m \ln \alpha + k \ln \beta$ or, recalling
that $k = n - m$, we get

$$\ln w = \ln n! - \ln m! - \ln (n-m)! + m \ln \alpha + (n-m) \ln \beta$$

We will assume that the number of throws $n$ is fixed and so $\ln w$
depends only on $m$. We find the derivative $\frac{d \ln w}{dm}$ without being
upset by the fact that, by the meaning of the problem, $m$ only
assumes integral values. We get

$$\frac{d \ln w}{dm} = -\frac{d \ln m!}{dm} - \frac{d \ln (n-m)!}{dm} + \ln \alpha - \ln \beta$$

The first term on the right is computed thus:

$$\frac{d \ln m!}{dm} \approx \ln (m+1)! - \ln m! = \ln(m+1) = \ln m + \ln\left(1 + \frac{1}{m}\right)$$

Since in a large number of trials $n$, the number $m$ is most often also
large, the quantity $\ln\left(1 + \frac{1}{m}\right) \approx \frac{1}{m}$ can be neglected in comparison
with $\ln m$. Hence

$$\frac{d \ln m!}{dm} = \ln m$$

And so

$$\frac{d \ln w}{dm} = -\ln m + \ln (n-m) + \ln \alpha - \ln \beta = \ln \frac{\alpha (n-m)}{m \beta} \qquad (5)$$

The maximum condition (that the derivative be zero) yields

$$\ln \frac{\alpha (n-m)}{m \beta} = 0, \quad \text{or} \quad \frac{\alpha}{\beta} \cdot \frac{n-m}{m} = 1$$

Finally,

$$\frac{m}{n-m} = \frac{\alpha}{\beta}$$

This last result is easy to grasp: the most probable outcome of a
trial is that in which the number of whites ($m$) is to the number of
blacks ($n - m$) as the probability of obtaining white ($\alpha$) in a single
trial is to the probability of obtaining black ($\beta$) in a single trial.

From the last relation we get $m\beta = \alpha(n - m)$, whence
$m(\alpha + \beta) = \alpha n$. Since $\alpha + \beta = 1$, it follows that $m = \alpha n$. Hence,
$k = n - m = n(1 - \alpha) = n\beta$. To summarize, the most probable
outcome is that in which white turns up $\alpha n$ times and black $\beta n$ times.*

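Denote the probability of this outcome by $w(n\alpha)$. Then

$$w(n\alpha) = \frac{n!}{(\alpha n)!\,(\beta n)!}\,\alpha^{\alpha n} \beta^{\beta n} \qquad (6)$$

By Stirling's formula we get

$$n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}, \quad (\alpha n)! = \sqrt{2\pi \alpha n}\left(\frac{\alpha n}{e}\right)^{\alpha n}, \quad (\beta n)! = \sqrt{2\pi \beta n}\left(\frac{\beta n}{e}\right)^{\beta n}$$

and so

$$\frac{n!}{(\alpha n)!\,(\beta n)!} = \frac{1}{\sqrt{2\pi n \alpha \beta}} \cdot \frac{1}{\alpha^{\alpha n} \beta^{\beta n}}$$

Using formula (6), we find

$$w(n\alpha) = \frac{1}{\sqrt{2\pi n \alpha \beta}} \qquad (7)$$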


Let us pose the problem of determining the probability of an
outcome that deviates but slightly from the most probable outcome.
To be more precise, we seek to determine the probability of the
number of white faces being equal to $m = n\alpha + \delta$ where $\delta$ is small
in comparison with $n$.


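* Note that the numbers $\alpha n$ and $\beta n$ may prove to be nonintegral. Then the most
probable number of occurrences of white is taken to be equal to the integer closest
to $\alpha n$.

We proceed from formula (5) in which we put $m = n\alpha + \delta$,
$dm = d\delta$. Then (5) takes the form

$$\frac{d \ln w(n\alpha + \delta)}{d\delta} = \ln \frac{\alpha(\beta n - \delta)}{(\alpha n + \delta)\beta}$$

Transform the right-hand side as follows:

$$\ln \frac{\alpha(\beta n - \delta)}{(\alpha n + \delta)\beta} = \ln \frac{1 - \frac{\delta}{n\beta}}{1 + \frac{\delta}{n\alpha}} = \ln\left(1 - \frac{\delta}{n\beta}\right) - \ln\left(1 + \frac{\delta}{n\alpha}\right)$$

Since $\frac{\delta}{n\beta}$ and $\frac{\delta}{n\alpha}$ are small in comparison with unity, we can put

$$\ln\left(1 - \frac{\delta}{n\beta}\right) \approx -\frac{\delta}{n\beta}, \qquad \ln\left(1 + \frac{\delta}{n\alpha}\right) \approx \frac{\delta}{n\alpha}$$

whence we get

$$\ln \frac{\alpha(\beta n - \delta)}{(\alpha n + \delta)\beta} \approx -\frac{\delta}{n\beta} - \frac{\delta}{n\alpha} = -\frac{\delta}{n\alpha\beta}$$

Thus

$$\frac{d \ln w(n\alpha + \delta)}{d\delta} = -\frac{\delta}{n\alpha\beta}$$

Integrating this equation from 0 to $\delta$, we get

$$\ln w(n\alpha + \delta) - \ln w(n\alpha) = -\frac{\delta^2}{2n\alpha\beta}$$

whence

$$w(n\alpha + \delta) = w(n\alpha)\, e^{-\frac{\delta^2}{2n\alpha\beta}}$$

Recalling that $w(n\alpha) = \frac{1}{\sqrt{2\pi\alpha\beta n}}$, we finally obtain

$$w(n\alpha + \delta) = \frac{1}{\sqrt{2\pi\alpha\beta n}}\; e^{-\frac{\delta^2}{2\alpha\beta n}} \qquad (8)$$

This is just the probability of the outcome in which the number of
white faces of a die differs from the most probable outcome by $\delta$.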

We again call expected those outcomes for which
$\frac{1}{e}\, w(n\alpha) \le w(n\alpha + \delta) \le w(n\alpha)$, that is, for which
$-\delta_1 \le \delta \le \delta_1$, where $\delta_1$ is determined from the condition
$w(n\alpha + \delta_1) = \frac{1}{e}\, w(n\alpha)$. Using (7) and (8), we get

$$\frac{1}{\sqrt{2\pi\alpha\beta n}}\; e^{-\frac{\delta_1^2}{2\alpha\beta n}} = \frac{1}{e} \cdot \frac{1}{\sqrt{2\pi\alpha\beta n}}, \qquad \frac{\delta_1^2}{2\alpha\beta n} = 1$$

and, finally, $\delta_1 = \sqrt{2\alpha\beta n}$.

Using (8), we can readily prove that $\int_{-\infty}^{+\infty} w(n\alpha + \delta)\, d\delta = 1$. It is
also easy to verify that when $\alpha = \beta = 1/2$ we get the formulas derived
at the beginning of this section.
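Formula (8) and its normalisation can be examined numerically for an unsymmetrical case. The sketch below is illustrative only: the values $n = 400$, $\alpha = 0.25$ and the function names `w_exact` and `w_gauss` are assumptions chosen here, not taken from the text.

```python
import math

def w_exact(n, m, alpha):
    """Exact probability of m white faces in n throws (Sec. 13.2 formula)."""
    beta = 1.0 - alpha
    return math.comb(n, m) * alpha ** m * beta ** (n - m)

def w_gauss(n, delta, alpha):
    """Formula (8): exp(-delta^2 / (2 alpha beta n)) / sqrt(2 pi alpha beta n)."""
    beta = 1.0 - alpha
    s = 2.0 * alpha * beta * n
    return math.exp(-delta ** 2 / s) / math.sqrt(math.pi * s)

n, alpha = 400, 0.25
m0 = round(alpha * n)                      # most probable number of whites
for delta in (0, 5, 10, 20):
    print(delta, w_exact(n, m0 + delta, alpha), w_gauss(n, delta, alpha))

# Sum of (8) over all admissible integer deviations; it should be close to 1.
total = sum(w_gauss(n, d, alpha) for d in range(-m0, n - m0 + 1))
print(total)
```

Setting `alpha = 0.5` in `w_gauss` reproduces formula (3), which is the check suggested in the text.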

Exercises 

1.  A coin  is  tossed  1000  times.  What  is  the  probability  of  obtaining 
exactly  500  heads?  exactly  510  heads? 

2.  A coin  is  tossed  1000  times.  What  is  the  probability  of  obtaining 
at  least  500  heads?  at  least  510  heads? 

3.  One  hundred  shots  are  fired  with  a probability  of  hitting  the 
target  equal  to  0.1,  the  probability  of  a miss  being  equal  to  0.9. 
What  is  the  probability  that  the  target  will  be  hit  exactly  10 
times?  exactly  eight  times? 

4. Under the conditions of Exercise 3, what is the probability that
the target will be hit at least 8 times? at least 10 times? at least
12 times?

5. A total of 1000 shells are fired with a probability of a hit equal
to 0.01. What is the probability that at least 8 shells will hit the
target? at least 11 shells?

13.4  Entropy 

Thus we have seen that the range of expected outcomes is
proportional to $\sqrt{n}$, whereas the range of all conceivable outcomes
is equal to $n$. With our definition of expected outcomes (when it
was required that their probability exceed $\frac{1}{e}$ of the maximum)
they constitute 84% of all outcomes for a large number of trials.
We could give an alternative definition of expected outcomes, for
example, by replacing the coefficient $\frac{1}{e}$ by 0.001. We would then
find that $\delta_1$ is still proportional to $\sqrt{n}$ (with a different constant of
proportionality), and the expected outcomes would constitute about
99.98% of the total, etc. At the same time, the interval of expected
outcomes for very large $n$ constitutes only a minute fraction of the
interval of all conceivable outcomes, since $\frac{\delta_1}{n} \to 0$ as $n \to \infty$.

This law has many important applications in physics. It states
that the values of a random variable (in this case, the number of
occurrences of a white face) for a sufficiently large number of trials
concentrate about the most probable mean value with an arbitrarily
high relative accuracy. Here it is common to use the principle of
replacing a large number of trials involving a single entity by a single
trial involving a statistical array, which is a system consisting of a
large number of the same entities. For example, instead of throwing
a die many times we can consider the simultaneous throwing of many
dice, and the number of white faces that turn up will obey the very
same law, which is to say that they will concentrate about the most
probable value.

Let us first consider a simple imaginary example. Suppose there
is an enormous pack of cards (say, $N = 10^{12}$ cards) including $\alpha N$
red cards and $\beta N$ black cards ($\alpha + \beta = 1$). Cards of a single colour
are taken to be indistinguishable. If we assume the state of the pack
to be the order in it of the red and black cards, then it is easy to see
that the total number $\Omega = \Omega(N, \alpha)$ of possible states is equal to the
number of combinations of $N$ elements taken $\alpha N$ at a time, or

$$\Omega = \frac{N!}{(\alpha N)!\,(\beta N)!}$$

Taking advantage of the Stirling formula, we get (see the
computations on page 531)

$$\Omega = \frac{1}{\sqrt{2\pi N \alpha \beta}} \cdot \frac{1}{\alpha^{\alpha N}} \cdot \frac{1}{\beta^{\beta N}}$$

We have seen that by far not all these states are equally
probable. For example, it is easy to calculate the probability that,
for $N = 10^{12}$, $\alpha = 0.5$, the mean density of red cards in the first
$10^{10}$ cards will exceed (in the case of decent shuffling) by more than
1% their mean density throughout the pack (equal to 0.5). Since
$10^{10}$ is only a small part of $10^{12}$, we can take it that the somewhat
elevated frequency of occurrence of red cards in this first portion
of the pack does not alter the probability of 0.5 that the next
card in each case is red. Thus, the desired probability is equal to
the probability considered in Sec. 13.3 that, in $10^{10}$ tosses of a coin,
heads turns up $\ge 1.01 \cdot 0.5 \cdot 10^{10} = (0.5 \cdot 10^{10} + 0.5 \cdot 10^{8})$ times. By
virtue of (3), the sought-for probability is equal to

$$\int_{0.5 \cdot 10^{8}}^{\infty} \sqrt{\frac{2}{\pi \cdot 10^{10}}}\; e^{-\frac{2\delta^2}{10^{10}}}\, d\delta = \frac{1}{\sqrt{2\pi}} \int_{10^{3}}^{\infty} e^{-\frac{t^2}{2}}\, dt = \frac{1}{2}\left[1 - \Phi(10^{3})\right]$$

(Here we put $2 \cdot 10^{-5}\, \delta = t$.)

Our table of values of the probability integral does not contain
entries for such large values of the argument, but using the method
described on page 69 makes it easy to obtain, for large $x$,

$$1 - \Phi(x) = \frac{2}{\sqrt{2\pi}} \int_x^{\infty} e^{-\frac{t^2}{2}}\, dt \approx \frac{2}{\sqrt{2\pi}\, x}\; e^{-\frac{x^2}{2}}$$

And so the desired probability is equal to
$\frac{1}{\sqrt{2\pi}} \cdot 10^{-3}\, e^{-10^{6}/2} \approx 10^{-2 \cdot 10^{5}}$.

This is an unbelievably small number. We can be quite sure that
if the pack is shuffled and checked thousands of millions of times
every second, then the foregoing increase in the mean density
of red cards in the first portion of the pack will never be recorded
during the whole lifetime of the solar system. The state of the pack
must be such that any portion of it consisting of a very large number $n$
of cards contains roughly $\alpha n$ red cards and $\beta n$ black cards with the
range of error of the order of $\sqrt{n}$. Of course, if $n$ is small, $n = 10$ or
100, then the mean density of red cards in some portion may prove
to be substantially different from $\alpha$, that is, the density may have
local fluctuations.
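A number as small as this cannot be represented in ordinary floating-point arithmetic, so it is natural to work with its logarithm. The following sketch (the function names are hypothetical, introduced for illustration) evaluates the large-argument estimate $\frac{1}{2}[1 - \Phi(x)] \approx e^{-x^2/2} / (\sqrt{2\pi}\, x)$ in logarithmic form and compares it, at a moderate argument where direct evaluation is still possible, with the complementary error function.

```python
import math

def log10_tail(x):
    """log10 of exp(-x^2/2) / (sqrt(2 pi) x), the large-x estimate of (1/2)(1 - Phi(x))."""
    return (-x * x / 2.0 - math.log(math.sqrt(2.0 * math.pi) * x)) / math.log(10.0)

def tail_direct(x):
    """(1/2)(1 - Phi(x)) computed via erfc; underflows to zero for very large x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

print(log10_tail(5.0), math.log10(tail_direct(5.0)))   # the two should be close
print(log10_tail(1.0e3))                               # exponent for the pack of cards
```

The last line gives the decimal exponent of the probability for the pack of cards, which should be of the order of $-2 \cdot 10^{5}$.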

We use the term entropy of the pack of cards for the number
$S = \ln \Omega$, that is, the logarithm of the number of its possible states.
From the foregoing we get

$$S \approx \alpha N \ln \frac{1}{\alpha} + \beta N \ln \frac{1}{\beta} - \frac{1}{2} \ln (2\pi\alpha\beta N) \approx N\left(\alpha \ln \frac{1}{\alpha} + \beta \ln \frac{1}{\beta}\right)$$

On the right-hand side we neglected the term of the order of
$\ln N \ll N$. Referred to a single card, the entropy is equal to

$$\frac{S}{N} = \alpha \ln \frac{1}{\alpha} + \beta \ln \frac{1}{\beta}$$
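How small the neglected term is can be seen by evaluating $\ln \Omega$ without Stirling's formula. The sketch below uses the log-gamma function, $\ln n! = \ln \Gamma(n+1)$; the names `ln_omega` and `entropy_per_card` and the sample values of $N$ and $\alpha$ are assumptions made for illustration.

```python
import math

def ln_omega(N, alpha):
    """ln of N! / ((alpha N)! (beta N)!), with alpha*N rounded to a whole number."""
    r = round(alpha * N)
    return math.lgamma(N + 1) - math.lgamma(r + 1) - math.lgamma(N - r + 1)

def entropy_per_card(alpha):
    """alpha ln(1/alpha) + beta ln(1/beta)."""
    beta = 1.0 - alpha
    return alpha * math.log(1.0 / alpha) + beta * math.log(1.0 / beta)

for N in (10**2, 10**4, 10**6):
    # columns: N, entropy per card from the exact count, limiting value
    print(N, ln_omega(N, 0.3) / N, entropy_per_card(0.3))
```

As $N$ grows, the second column should approach the third, the difference being of the order of $\ln N / N$.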

Now imagine two packs of cards with characteristics $N_1$, $\alpha_1$ and
$N_2$, $\alpha_2$, respectively, and no interchange of cards between them (the
packs) being possible. Such a system has $\Omega = \Omega_1 \Omega_2$ possible states
(any state of the first pack may be combined with any state of the
second pack), whence the appropriate entropy is

$$S = \ln \Omega = \ln \Omega_1 + \ln \Omega_2 = S_1 + S_2$$

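or the entropy of a system of several noninteracting components is
equal to the sum of the entropies of the components. Now if these
two packs are shuffled together, then they constitute a single common
pack, the number of cards being $N = N_1 + N_2$, of which there are
$\alpha_1 N_1 + \alpha_2 N_2 = \alpha N$ red cards, where $\alpha = \frac{\alpha_1 N_1 + \alpha_2 N_2}{N}$. The
appropriate entropy is equal to

$$\tilde{S} = N\left(\alpha \ln \frac{1}{\alpha} + \beta \ln \frac{1}{\beta}\right) = (\alpha_1 N_1 + \alpha_2 N_2) \ln \frac{N}{\alpha_1 N_1 + \alpha_2 N_2} + (\beta_1 N_1 + \beta_2 N_2) \ln \frac{N}{\beta_1 N_1 + \beta_2 N_2}$$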


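It is easy to verify that for $\alpha_2 = \alpha_1$ (and hence $\beta_2 = \beta_1$) it will be
true that $\tilde{S} = S$, where $\tilde{S}$ denotes the entropy of the combined pack.
This is also evident from the fact that for $\alpha_2 = \alpha_1$
the interaction between packs (i.e. their joint shuffling) does not
yield anything new. But if $\alpha_2 \ne \alpha_1$, then, as we will now verify, it
must be true that $\tilde{S} > S$. Indeed,

$$\tilde{S} - S = (\alpha_1 N_1 + \alpha_2 N_2) \ln \frac{N}{\alpha_1 N_1 + \alpha_2 N_2} + (\beta_1 N_1 + \beta_2 N_2) \ln \frac{N}{\beta_1 N_1 + \beta_2 N_2} - N_1\left(\alpha_1 \ln \frac{1}{\alpha_1} + \beta_1 \ln \frac{1}{\beta_1}\right) - N_2\left(\alpha_2 \ln \frac{1}{\alpha_2} + \beta_2 \ln \frac{1}{\beta_2}\right)$$

If we consider $N_1$, $N_2$, and $\alpha_1$ as given, and $\alpha_2$ as variable, and if we
note that $\beta_2 = 1 - \alpha_2$, then it is easy to compute directly that

$$\frac{d(\tilde{S} - S)}{d\alpha_2} = N_2 \ln \frac{(\beta_1 N_1 + \beta_2 N_2)\,\alpha_2}{(\alpha_1 N_1 + \alpha_2 N_2)\,\beta_2}$$

$$\frac{d^2(\tilde{S} - S)}{d\alpha_2^2} = \frac{\alpha_1 N_1 N_2}{\alpha_2(\alpha_1 N_1 + \alpha_2 N_2)} + \frac{\beta_1 N_1 N_2}{\beta_2(\beta_1 N_1 + \beta_2 N_2)}$$

For $\alpha_2 = \alpha_1$ we have $\tilde{S} - S = 0$, $\frac{d(\tilde{S} - S)}{d\alpha_2} = 0$, and since from
the expression of the second derivative it follows that it is positive
for all $\alpha_2$ and for this reason the graph of $\tilde{S} - S$ as a function of $\alpha_2$
is convex downwards, then for $\alpha_2 \ne \alpha_1$ we will have $\tilde{S} - S > 0$, i.e.
$\tilde{S} > S$, which is what was asserted.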
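The inequality just proved is easy to probe numerically. This is a minimal sketch using the large-$N$ expression for the entropy, $N\left(\alpha \ln \frac{1}{\alpha} + \beta \ln \frac{1}{\beta}\right)$; the function names and the sample pack sizes are hypothetical.

```python
import math

def h(alpha):
    """Entropy per card: alpha ln(1/alpha) + beta ln(1/beta)."""
    beta = 1.0 - alpha
    return alpha * math.log(1.0 / alpha) + beta * math.log(1.0 / beta)

def entropy_gain(N1, a1, N2, a2):
    """Entropy of the jointly shuffled pack minus the sum of the separate entropies."""
    N = N1 + N2
    a = (a1 * N1 + a2 * N2) / N        # fraction of red cards in the combined pack
    return N * h(a) - N1 * h(a1) - N2 * h(a2)

print(entropy_gain(1000, 0.3, 2000, 0.3))   # equal densities: no gain
for a2 in (0.1, 0.2, 0.4, 0.6, 0.9):
    print(a2, entropy_gain(1000, 0.3, 2000, a2))   # unequal densities: positive gain
```

The gain should vanish only when the two densities coincide and grow as they move apart, in agreement with the convexity argument.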

If we imagine that the two packs with characteristics $N_1$, $\alpha_1$
and $N_2$, $\alpha_2$ are in juxtaposition but, due to the large volume, exhibit
only "local shuffling" of comparatively small portions of the
combined pack, then for $\alpha_2 > \alpha_1$ there will occur (as the shuffling takes
place) a gradual diffusion of red cards from the second pack into the
first, and the mean density of red cards will gradually even out.
If at any time in this process we put up a barrier between the first $N_1$
cards and the last $N_2$ cards, we can stop the diffusion and compute
the entropy in the intermediate state after such a separation. This
entropy increases as the evening out of the mean density of red cards
takes place.

After the evening out process, the entropy remains practically
constant. Of course, we can imagine that, due to the constant
shuffling, the mean density of red cards in one part of the pack will
become substantially greater than in the other. We have already
demonstrated, however, that in the case of a large number of cards
the probability of such an event is inconceivably small. Thus, the
process of entropy increase is irreversible.

The pack of cards we have just been discussing is the simplest
model of a statistical physical system. The model is one-dimensional
(the cards are arranged in a row) and each card can exist only
in two states: red or black. (For an entity that can be in several
states with probabilities $p_i$, the entropy referred to a single item
turns out equal to $\sum_i p_i \ln \frac{1}{p_i}$.) Similar properties are exhibited by
more complicated actual statistical systems like, say, the system of
molecules of a gas contained in a specific volume.

According to quantum mechanics, every value of energy $E$
is associated with a definite and very large number $\Omega$ of the quantum
states of the system at hand, the internal energy of which* does not
exceed $E$. To get an idea of the order of this number, let us take
advantage of the approximate estimate, in quantum mechanics,
of the number $\Omega_1$ of states of a single molecule in a volume $V$, whose
momentum does not exceed a certain value $p$: $\Omega_1 = p^3 V / 6\pi^2 \hbar^3$,
where $\hbar$ is Planck's constant, $\hbar \approx 10^{-27}$ erg·s. The mean momentum
of a single molecule is connected with the temperature $\tau$ of the gas
by the formula $p = \sqrt{3km\tau}$, where $k$ is the Boltzmann constant,
$k = 1.38 \cdot 10^{-16}$ erg/degree, and $m$ is the mass of a molecule equal
to $\frac{\mu}{N_0}$, where $\mu$ is the molecular weight and $N_0 = 6.02 \cdot 10^{23}$
(Avogadro's number) is the number of molecules in a gram-molecule of gas.
For oxygen ($\mu = 32$), in 1 litre $= 10^3$ cm$^3$ at $\tau = 0$°C $= 273$ K, we get

$$\Omega_1 = \left(3 \cdot 1.38 \cdot 10^{-16} \cdot \frac{32}{6.02 \cdot 10^{23}} \cdot 273\right)^{3/2} \cdot \frac{10^3}{6\pi^2 (10^{-27})^3} \approx 2.5 \cdot 10^{29}$$

But in 1 litre of gas under normal conditions there are about
$N = 3 \cdot 10^{22}$ molecules. And so we can approximately take

$$\Omega = \Omega_1^N = (2.5 \cdot 10^{29})^{3 \cdot 10^{22}} \approx 10^{10^{24}}$$

for the total number of quantum states of such a portion of oxygen.
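The arithmetic of this estimate can be retraced in a few lines. The sketch below simply substitutes the rounded constants quoted in the text (CGS units); the variable names are chosen here for illustration. Since $\Omega$ itself is far too large to be stored as a number, only its decimal logarithm is computed.

```python
import math

k = 1.38e-16        # Boltzmann constant, erg/degree (value quoted in the text)
hbar = 1.0e-27      # Planck's constant as rounded in the text, erg*s
N0 = 6.02e23        # Avogadro's number
mu = 32.0           # molecular weight of oxygen
V = 1.0e3           # volume of 1 litre in cm^3
tau = 273.0         # temperature, K
N = 3.0e22          # approximate number of molecules in 1 litre

m = mu / N0                                   # mass of one molecule, g
p = math.sqrt(3.0 * k * m * tau)              # mean momentum, g*cm/s
omega1 = p ** 3 * V / (6.0 * math.pi ** 2 * hbar ** 3)
log10_omega = N * math.log10(omega1)          # log10 of Omega = omega1 ** N

print(omega1)        # number of states of a single molecule
print(log10_omega)   # decimal logarithm of the total number of states
```

The second printed value is the exponent in $\Omega = 10^{(\ldots)}$; it should come out somewhat below $10^{24}$, which the text rounds to $10^{10^{24}}$.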

This is a fantastically large number. Incidentally, as will be seen
from what follows, this number does not actually occur in
calculations, but rather its logarithm (which too is very great) and, moreover,
even the difference of the values of this logarithm for a system under
different conditions.

The quantity $S = k \ln \Omega$ is called the entropy of the system:
here, $k$ is a fundamentally inessential proportionality factor equal
to the Boltzmann constant. The appearance of this factor is due to
the fact that the concept of entropy was originally introduced for
other reasons, in connection with the study of thermal processes.
For this reason, the unit for measuring entropy is chosen so that it
retains the same value as in thermodynamics. In other words, the
coefficient $k$ enables us to obtain the temperature in degrees instead
of ergs.

As $E$ varies, so does $\Omega$ and, hence, $S$ as well; and in the case
of a constant volume of gas the ratio $\frac{dE}{dS} = \tau$ is called its
temperature. In a joint consideration of two systems $a$ and $b$, as long as
they are not interacting, we have $\Omega = \Omega_a \Omega_b$ for the combined
system, and so $S = S_a + S_b$. If the systems can exchange energy
but are otherwise independent (the portions of gas are separated by
a thin partition), then the same relations hold true but the
participating quantities will vary in the course of time. If the system as
a whole is isolated from the ambient medium, then the energy is
conserved: $E = E_a + E_b = \text{constant}$, whence $\dot{E}_a + \dot{E}_b = 0$, where
the dot denotes the time derivative. From this we have

$$\dot{S} = \dot{S}_a + \dot{S}_b = \frac{1}{\tau_a}\,\dot{E}_a + \frac{1}{\tau_b}\,\dot{E}_b = \left(\frac{1}{\tau_a} - \frac{1}{\tau_b}\right)\dot{E}_a$$

Hence, for $\tau_a = \tau_b$ we have $\dot{S} = 0$; the entropy does not vary at
equal temperatures of the interacting portions of gas. But if, say,
the temperature of the system $a$ is less than that of the system $b$,
$\tau_a < \tau_b$, then $\dot{E}_a > 0$ and $\dot{S} > 0$, which means the system $a$ acquires
energy from $b$ and the entropy of the overall system increases. A
similar increase in entropy occurs in the case $\tau_a > \tau_b$ as well. This
process of pumping energy and increasing entropy continues until
the temperatures even out, after which the entropy remains
practically constant. It is logically possible to conceive of a case where in
the exchanges of energy the temperature at some time in one
portion of the system again becomes greater than in another portion,
but this is immeasurably less probable than the earlier discussed
variation in the density of red cards. Incidentally, extremely small
portions of gas can exhibit fluctuations of temperature and of the
associated particle velocities, as witness the familiar phenomenon
of the Brownian motion.

Since the entropy of a statistical system can only increase or
remain constant, the processes in which the entropy increases are
termed irreversible processes. These processes become impossible
if, when filmed, they are played back in reverse order.
Mathematically, inversion of a process reduces to replacing the time $t$ by $-t$.
Thus, if the law of development of a process is recorded in the form
of a linear differential equation not containing the time $t$ explicitly,
then the condition of reversibility consists in the presence, in the
equation, of derivatives with respect to $t$ solely of even order. We can
illustrate this in the case of the linear oscillator that was considered
in Sec. 7.3: the term with the coefficient $h$ in equation (21) of Ch. 7
(the term containing the first derivative with respect to time) corresponded
to friction in the system, and friction leads to a dissipation of energy
and, thus, to irreversibility of the process.

Note in conclusion that the principle of entropy increase (the
law of nondecreasing entropy) or the so-called second law of
thermodynamics is of a probabilistic nature and is ultimately connected
with the fact that the more probable a state of nature, the more
often it occurs. To put it differently, processes associated with a decrease
of entropy fail to be observed not because they are logically or
formally contradictory but because of their extremely small
probability.

Let us consider the phenomenon of radioactive decay. As we
know (see HM, Sec. 5.3), the probability of a single atom decaying
in an extremely small time $t_1$ is equal to $wt_1$; the probability that
decay will not occur in this time is equal to $1 - wt_1$. (Here, $w$ is a
constant describing the given radioactive material.)

Consider a long time interval $t$. Let us find the probability $w(t)$
that no decay will occur in that time interval. To do this,
partition the interval $t$ into small subintervals of duration $t_1, t_2, \ldots, t_n$.
The probability that decay will not occur during time $t_i$ is equal to

$$1 - wt_i$$

The probability $w(t)$ is equal to the product of the probabilities that
decay does not occur in any one of the time intervals $t_1, t_2, \ldots, t_n$.
Therefore

$$w(t) = (1 - wt_1)(1 - wt_2) \cdots (1 - wt_n)$$

Consider $\ln w(t)$. It is clear that

$$\ln w(t) = \ln(1 - wt_1) + \ln(1 - wt_2) + \ldots + \ln(1 - wt_n)$$

Since the quantities $wt_1, wt_2, \ldots, wt_n$ are small in comparison with 1,
the logarithms on the right can be expanded in a series. Confining
ourselves to the first term of the expansion, we get

$$\ln w(t) = -wt_1 - wt_2 - \ldots - wt_n = -wt$$

Taking antilogarithms, we get

$$w(t) = e^{-wt}$$

We thus obtain a familiar result: the ratio of the number of atoms
that have not decayed in time $t$ to the original number of atoms
is $e^{-wt}$.
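The passage from the product to the exponential can be watched numerically. In the sketch below the interval is divided, for simplicity, into $n$ equal subintervals (the text allows unequal ones); the function name and the sample values of $w$ and $t$ are assumptions made for illustration.

```python
import math

def survival_product(w, t, n):
    """Product of (1 - w * t_i) over n equal subintervals t_i = t / n."""
    dt = t / n
    return (1.0 - w * dt) ** n

w, t = 0.7, 2.0
for n in (10, 100, 10_000, 1_000_000):
    # columns: number of subintervals, product, limiting value exp(-w t)
    print(n, survival_product(w, t, n), math.exp(-w * t))
```

The finer the partition, the closer the product should come to $e^{-wt}$, which reflects the fact that only the first term of each logarithmic expansion was retained.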

If the probability that an atom will not decay during time $t$ is
denoted by $\beta$, then $\beta = e^{-wt}$. The probability $\alpha$ that during time $t$
an atom will decay is $\alpha = 1 - e^{-wt}$.

If there are $n$ atoms, then the probability $w(m, n)$ that $m$ of them
will decay and $k = n - m$ will not decay is given by

$$w(m, n) = \frac{n!}{m!\,k!}\,\alpha^m \beta^k \qquad (9)$$

where $\alpha = 1 - e^{-wt}$, $\beta = e^{-wt}$ (see Sec. 13.2).

Let us consider an important special case: let the total number of radioactive atoms $n$ be extremely great and the probability of decay during time $t$ extremely small, so that the most probable number of disintegrations during $t$ is a finite number, which we denote by $\mu$. Then $\mu = \alpha n$, as was established in Sec. 13.3.

To summarise, we want to know the form of (9) if $n$ increases without bound, $\alpha$ approaches zero, and their product $\alpha n = \mu$ (and $m$ as well) remains a finite number.

We consider the factor $\frac{n!}{k!}\alpha^m = \frac{n!\,\alpha^m}{(n - m)!}$, which we write as follows:

$$\frac{n!\,\alpha^m}{(n - m)!} = n(n - 1)(n - 2)\cdots(n - m + 1)\,\alpha^m$$

$$= (n\alpha)(n\alpha - \alpha)(n\alpha - 2\alpha)\cdots[n\alpha - (m - 1)\alpha]$$

$$= \mu(\mu - \alpha)(\mu - 2\alpha)\cdots[\mu - (m - 1)\alpha]$$

Therefore, for fixed $m$ and for very large $n$, we have

$$\frac{n!\,\alpha^m}{(n - m)!} \approx \mu^m$$

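Now let us consider the quantity $\beta^k = \beta^{n-m} = (1 - \alpha)^{n-m}$. Since $\alpha$ is an extremely small quantity, it follows that $\beta$ is close to unity. But the exponent $n - m$ is great and so, replacing $\beta^{n-m}$ by 1, we risk committing a substantial error.

Let us do as follows:

$$\beta^{n-m} = (1 - \alpha)^{n-m} = \frac{(1 - \alpha)^n}{(1 - \alpha)^m} \approx (1 - \alpha)^n$$

since $(1 - \alpha)^m$ is close to 1. Recalling that $n\alpha = \mu$, we get

$$(1 - \alpha)^n = \left(1 - \frac{\mu}{n}\right)^n = \left[\left(1 - \frac{\mu}{n}\right)^{-n/\mu}\right]^{-\mu} \approx e^{-\mu}$$

since $\left(1 - \frac{\mu}{n}\right)^{-n/\mu}$ is the closer to $e$, the greater $n$ is.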

Finally, from (9) we find, as $n \to \infty$,

$$w(m, n) \to w_\mu(m) = \frac{\mu^m}{m!}\,e^{-\mu} \qquad (10)$$

The new notation $w_\mu(m)$ signifies the probability of observing $m$ disintegrations if the most probable number of disintegrations is $\mu$, and the number of atoms $n$ is very great so that the number of disintegrations is a small portion of the number of atoms.
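
The limit leading to (10) can be illustrated numerically. The sketch below compares formula (9) with formula (10) for a fixed $\mu = \alpha n$ and growing $n$; the helper names (`binomial`, `poisson`, `max_gap`) and the choice $\mu = 2$ are assumptions made for the illustration only.

```python
import math

def binomial(m, n, alpha):
    """Formula (9): probability that m of n atoms decay."""
    return math.comb(n, m) * alpha**m * (1 - alpha) ** (n - m)

def poisson(m, mu):
    """Formula (10): limiting probability of m disintegrations."""
    return mu**m * math.exp(-mu) / math.factorial(m)

def max_gap(n, mu, m_max=8):
    """Largest difference between (9) and (10) for m = 0 .. m_max - 1."""
    alpha = mu / n
    return max(abs(binomial(m, n, alpha) - poisson(m, mu)) for m in range(m_max))

for n in (10, 100, 10_000):
    print(n, max_gap(n, 2.0))
```

The printed gap shrinks as $n$ grows with $\mu$ held fixed, in line with the limiting argument.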

The law expressed by (10) is called the Poisson distribution. Let us convince ourselves that the sum of the probabilities $w_\mu(m)$ for all values of $m$ is equal to 1, that is, that

$$\sum_{m=0}^{\infty} w_\mu(m) = 1$$

Indeed,

$$\sum_{m=0}^{\infty} w_\mu(m) = \sum_{m=0}^{\infty}\frac{\mu^m}{m!}\,e^{-\mu} = e^{-\mu}\sum_{m=0}^{\infty}\frac{\mu^m}{m!} = e^{-\mu}\cdot e^{\mu} = 1$$

Sec.  13.5  Radioactive  decay.  Poisson’s  formula  541 


Fig.  188 


The Poisson distribution shows what the probability is of observing $m$ disintegrations if the most probable number of disintegrations is $\mu$ and the separate disintegrations are independent, which is to say that the fact that a certain number of disintegrations have occurred does not alter the probability of obtaining another disintegration (for this purpose we stipulated that the total number of radioactive atoms $n$ is great so that $m \ll n$).

In contrast to the problems of preceding sections, here the total number of disintegrations is not restricted in any way. In preceding sections we considered cases where a definite (albeit very great) number of trials, $n$, are carried out. We specified this number $n$ and it entered into the final results. In this section, $n$ (the number of atoms or, what is the same thing, the number of trials, a trial consisting in seeing whether an atom decays or not) is assumed to be unlimited. Hence, so is the number of acts of decay (disintegration). Fundamentally, it is possible (though highly unlikely) to observe any (arbitrarily large) number of disintegrations for one and the same most probable number of disintegrations $\mu$.

If $\mu$ is small, $\mu \ll 1$, then, using (10), we find that the probability of not observing a single instance of disintegration is equal to $e^{-\mu} \approx 1 - \mu$, which is extremely close to unity. The probability of observing a single disintegration is markedly less, namely, $\mu e^{-\mu} \approx \mu(1 - \mu) \approx \mu$. The probabilities of observing two, three, etc. disintegrations diminish very rapidly (they are approximately equal to $\mu^2/2$, $\mu^3/6$, etc.).

If $\mu$ is great, then it is most probable to observe $\mu$ disintegrations. Indeed, let us find for what $m$ ($\mu$ is constant!) the quantity $w_\mu(m)$ has a maximum. It is more convenient to seek the maximum of $\ln w_\mu(m)$. Since $\ln w_\mu(m) = -\mu + m\ln\mu - \ln m!$, it follows that

$$\frac{d}{dm}\ln w_\mu(m) = \ln\mu - \frac{d}{dm}(\ln m!) = \ln\mu - \ln m$$

(see Sec. 13.3). Therefore the equation $\frac{d}{dm}\ln w_\mu(m) = 0$ yields $m = \mu$.
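
The location of the maximum can also be found by direct search over integer $m$. The sketch below works with $\ln w_\mu(m)$, as in the argument above, using the log-gamma function for $\ln m!$; the names `log_poisson` and `most_probable` are illustrative, and the search bound `m_max` is an arbitrary assumption.

```python
import math

def log_poisson(m, mu):
    """ln w_mu(m) = -mu + m ln(mu) - ln(m!), with ln(m!) = lgamma(m + 1)."""
    return -mu + m * math.log(mu) - math.lgamma(m + 1)

def most_probable(mu, m_max=1000):
    """Integer m in 0..m_max that maximizes w_mu(m)."""
    return max(range(m_max + 1), key=lambda m: log_poisson(m, mu))

for mu in (7.3, 20.5, 100.5):
    print(mu, most_probable(mu))
```

Since $m$ takes only integer values, the search returns an integer lying within one unit of $\mu$, which agrees with the continuous estimate $m = \mu$ for large $\mu$.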

In Fig. 188 we have the Poisson distribution for $\mu = 0.5$, $\mu = 2$, $\mu = 3$. Note that Fig. 188 is not exact: the quantity $w_\mu(m)$ is actually given only for integer values of $m$, so that the curve of $w_\mu(m)$ for all $m$, i.e. for fractional $m$, say, is meaningless. The way to do it would be to give for every $\mu$ a series of quantities $w_\mu(0)$, $w_\mu(1)$, $w_\mu(2)$, etc. depicted by vertical lines erected from integral points on the axis of abscissas. The curve $w_\mu(m)$ passes through the tops of these lines.

The aspect of the function $w_\mu(m)$ for large $\mu$ will be described in Sec. 13.8.

Exercises 

1. Consider the ratio of $w_\mu(m)$ to $w_\mu(m - 1)$ and draw a conclusion about the maximum of $w_\mu(m)$ for a given $\mu$.

2. Suppose a certain quantity $x$ assumes in a series of trials the values $x_1, \ldots, x_n$ with probabilities $p_1, \ldots, p_n$. Prove that the mean value $\bar{x}$ of this quantity in a single trial is equal to $x_1p_1 + \ldots + x_np_n$. On this basis, prove that the mean value $\bar{m}$ of the number of disintegrations during time $t$ is just equal to $\mu$.

13.6  An  alternative  derivation  of  the  Poisson 
distribution 

Here we present an alternative derivation of the Poisson distribution that is based on arguments that differ from those used in Sec. 13.5. Imagine a large number of instruments (counters) each of which records disintegrations in similar samples containing a long-half-life material.

For the sake of calculational convenience, we assume that on the average there is one disintegration in each sample in unit time. Then, on the average, there will be $t$ disintegrations in time $t$. Denote by $x_0$ the number of counters that do not record a single disintegration ($m = 0$), by $x_1$ the number of counters that record a single disintegration ($m = 1$), by $x_2$ the number of counters that record two disintegrations ($m = 2$), etc. It is clear that $x_0, x_1, x_2, \ldots$ depend on the time that has elapsed from the start of the experiment. Suppose at a definite time $t$ we know the quantities $x_0(t), x_1(t), x_2(t), \ldots$, indicating the number of counters that have recorded $0, 1, 2, \ldots$ disintegrations.

How  will  these  numbers  of  counters  change  over  a small  time 
interval  dt ? 

In  any  group  of  n counters  there  will  be  n disintegrations  in 
unit  time  and  there  will  be  n dt  disintegrations  in  time  dt.  And  so 
in  this  group,  n dt  counters  will  record  one  disintegration  each. 

Let us consider the group of counters that have not recorded a single disintegration. There will be one disintegration in $x_0(t)\,dt$ of these counters, and these counters will pass into another group, namely the group $x_1$. Hence the number of counters that have not recorded a single disintegration during the time $t + dt$ is

$$x_0(t + dt) = x_0(t) - x_0(t)\,dt$$

whence

$$x_0(t + dt) - x_0(t) = -x_0(t)\,dt, \quad\text{or}\quad \frac{dx_0}{dt} = -x_0$$


Consider the group $x_1(t)$. As before, $x_1(t)\,dt$ of these counters will go into the group $x_2$, but $x_0(t)\,dt$ counters from group $x_0$ will go into the group under consideration. Therefore

$$x_1(t + dt) = x_1(t) - x_1(t)\,dt + x_0(t)\,dt$$

whence

$$\frac{dx_1}{dt} = x_0 - x_1$$


Continuing this reasoning, we arrive at a chain of differential equations:

$$\frac{dx_0}{dt} = -x_0, \qquad \frac{dx_1}{dt} = x_0 - x_1, \qquad \frac{dx_2}{dt} = x_1 - x_2, \qquad \ldots \qquad (11)$$


Clearly, at the initial time $t = 0$ there will be $x_0 = N$, where $N$ is the total number of counters, $x_1 = x_2 = x_3 = \ldots = 0$. The equations (11) are easily solved one after the other (see Sec. 7.2), and we get

$$x_0 = Ne^{-t}, \qquad x_1 = Nte^{-t}, \qquad x_2 = N\frac{t^2}{2!}e^{-t}, \qquad x_3 = N\frac{t^3}{3!}e^{-t}, \qquad \ldots$$
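
The chain (11) can also be integrated step by step, exactly as in the counting argument: in each small interval $dt$ a fraction $dt$ of every group passes into the next one. The following sketch does this with the simplest (Euler) stepping and compares the result with the closed-form solutions; the function names, the number of groups tracked, and the step count are assumptions made for the illustration.

```python
import math

def integrate_chain(N, t_end, groups=6, steps=100_000):
    """Euler integration of dx0/dt = -x0, dxm/dt = x(m-1) - xm, with x0(0) = N."""
    dt = t_end / steps
    x = [float(N)] + [0.0] * (groups - 1)
    for _ in range(steps):
        new = x[:]
        new[0] = x[0] - x[0] * dt                  # counters leaving group 0
        for m in range(1, groups):
            new[m] = x[m] + (x[m - 1] - x[m]) * dt  # arrivals minus departures
        x = new
    return x

def exact(N, t, m):
    """Closed-form solution x_m(t) = N t^m / m! * exp(-t)."""
    return N * t**m / math.factorial(m) * math.exp(-t)

numeric = integrate_chain(1000, 2.0)
for m, value in enumerate(numeric):
    print(m, value, exact(1000, 2.0, m))
```

With a sufficiently small step the two columns agree closely, which is a direct check that the listed functions solve the chain.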


Suppose $\mu$ units of time have elapsed since the inception of the process. During this time there will have occurred an average of $\mu$ disintegrations in each sample, and in each counter of group $m$, whose number is equal to $x_m(\mu) = N\frac{\mu^m}{m!}e^{-\mu}$, there will have occurred $m$ disintegrations. For this reason, the probability that there will be $m$ disintegrations in a randomly chosen counter is equal to the ratio of the number of counters of the group, $x_m(\mu)$, to the total number of counters, or

$$w_\mu(m) = \frac{x_m(\mu)}{N} = \frac{\mu^m}{m!}\,e^{-\mu}$$

We have obtained the same result as in Sec. 13.5.

Exercise

Consider the following generalization of the problem that has been analyzed. Suppose a certain entity can be in the states $\ldots, C_{-2}, C_{-1}, C_0, C_1, C_2, \ldots$, and at time $t = 0$ it was in the state $C_0$, and during time $dt$ it passes with probability $\omega\,dt$ into the next state or, with probability $\alpha\,dt$, into the preceding state. Indicate a system of equations satisfied by the probabilities $p_i(t)$ of being in the state $C_i$ at time $t$, find approximate expressions for $p_i(t)$ by the method of successive approximations, and find the mean value of the number of the state at time $t$. For what values of $\alpha$ and $\omega$ do we get the problem discussed in the text?

13.7  Continuously  distributed  quantities 

Let us return to the fishing problem discussed in Sec. 13.1. Let the probability of catching a fish of weight between $p$ and $p + dp$ be equal to $f(p)\,dp$ (no fish caught will be assumed to be equal to a fish of weight zero, so that everyone will be satisfied).

The function $f(p)$ is called the distribution function. It satisfies the condition $\int_0^{+\infty} f(p)\,dp = 1$, since this integral is the sum of the probabilities of all the events that can take place.

Note that actually in any body of water there can be no fish whose weight exceeds some number $P$. However, we put $+\infty$ as the upper limit of integration instead of $P$. This can be done by taking the function $f(p)$ as being equal to 0 for $p > P$ or, at any rate, as being a rapidly decreasing function as $p$ increases and as having such small values for $p > P$ that the value of the integral is for all practical purposes unaffected by these values*.

It  is  important  to  note  the  following.  We  consider  that  only  a 
very  small  portion  of  all  the  fish  in  the  pond  are  caught.  For  this 
reason,  the  probability  of  catching  a fish  whose  weight  lies  between 
the  limits  of  p and  p + dp  does  not  depend  on  how  many  fish  have 
already  been  caught  and  what  they  weigh.  In  other  words,  the 
function $f(p)$ does not depend on what fish have already been caught.
This  function  describes  a given  rather  large  body  of  water. 

* In other problems the quantity at hand (in the given case the weight of a fish) can assume negative values as well. In this case $f(p)$ satisfies the condition $\int_{-\infty}^{+\infty} f(p)\,dp = 1$.

We pose the following problem. Suppose a large number $n$ of fish have been caught (to be more precise, the fish hook has been pulled out of the water $n$ times). What is the mean weight of a single fish? The probability of catching a fish weighing between $p$ and $p + dp$, where $dp$ is small, is equal to $f(p)\,dp$. For this reason, of the total number $n$, there will be $n\,f(p)\,dp$ cases in which a fish of weight $p$ will have been caught.* The weight of all such fish is equal to $p \cdot n f(p)\,dp$. Integrating the last expression over all $p$, that is, from $p = 0$ to $p = +\infty$, we get the total weight of the catch after casting the line $n$ times:

$$P_n = n\int_0^{\infty} p\,f(p)\,dp$$

Dividing the total weight of the catch $P_n$ by the number of times $n$ the line is cast, we get the mean weight of a fish referred to a single cast of the line:

$$\bar{p}_1 = \frac{P_n}{n} = \int_0^{\infty} p\,f(p)\,dp \qquad (12)$$

We now pose a more complicated problem. Suppose $n$ fish have been caught. The probability that the total weight of the catch will lie within the range from $p$ to $p + dp$ is $F_n(p)\,dp$. The function $F_n(p)$ is the distribution function over the weight of a catch consisting of $n$ fish. Let us attempt to find this function. To do this we first set up an equation connecting $F_{n+1}(p)$ and $F_n(p)$.

Suppose we have the distribution $F_n(p)$ after pulling out the line $n$ times. How is it possible, after $n + 1$ extractions of the line, to obtain a total weight of the catch lying within the range from $p$ to $p + dp$?

If the weight of the last, $(n + 1)$st, fish lies in the range from $\mu$ to $\mu + d\mu$, where $d\mu$ is much less than $dp$, then for the sum of the weights of $n$ fish and the $(n + 1)$st fish to fall within the given range from $p$ to $p + dp$ it is necessary that the weight of $n$ fish lie within the range from $p - \mu$ to $p + dp - \mu$ (here we disregard the small quantity $d\mu$).

* More precisely, a fish whose weight is very close to $p$, namely, within the interval $dp$ about $p$. Note that for a quantity that varies continuously and is not obliged to assume definite integral (generally discrete) values, it is meaningless to ask what the probability is that it will assume an exactly definite value: the probability is clearly zero. Only the probability of falling in a definite interval is meaningful, and this probability is proportional to the length of the interval when the interval is small.

The probability of catching the $(n + 1)$st fish weighing from $\mu$ to $\mu + d\mu$ is, as we know, equal to $f(\mu)\,d\mu$. The probability of the weight of the first $n$ fish lying in the range from $p - \mu$ to $p - \mu + dp$ is $F_n(p - \mu)\,dp$.

The probability of an event consisting in the weight of the first $n$ fish falling in the indicated range and the weight of the $(n + 1)$st fish also being in this range is equal to the product of the probabilities of these separate events and, hence, is equal to

$$f(\mu)\,d\mu \cdot F_n(p - \mu)\,dp \qquad (13)$$

Note that the total weight of all fish caught that lies in the indicated range can be obtained in a multitude of ways since the weight $\mu$ of the last fish can assume any value from 0 to $+\infty$. And so the total probability that the weight of all fish caught lies in the indicated range is equal to the sum of the expressions (13) written for distinct values of $\mu$. And since $\mu$ assumes all possible values, in place of the sum we have the integral. Thus, this probability is

$$\int_0^{\infty} f(\mu)\,d\mu\,F_n(p - \mu)\,dp = dp\int_0^{\infty} f(\mu)\,F_n(p - \mu)\,d\mu \qquad (14)$$

However, by definition, the probability that the total weight of $(n + 1)$ fish lies in the range from $p$ to $p + dp$ is equal to $F_{n+1}(p)\,dp$. Equating this last expression to the right member of (14) and cancelling out $dp$, we get

$$F_{n+1}(p) = \int_0^{\infty} f(\mu)\,F_n(p - \mu)\,d\mu \qquad (15)$$

Knowing the function $F_1(\mu) = f(\mu)$ (since $F_1(p)$ refers to the case of a catch consisting of a single fish) and $F_n$, formula (15) enables us to find $F_{n+1}(p)$, which is to say, to pass successively from subscript $n$ to $n + 1$.*

Let us consider a simple example.

Suppose $f(p) = \frac{1}{q}$ if $0 < p < q$ and $f(p) = 0$ for all other values of $p$. This means that the body of water does not have any fish weighing more than $q$, and the probability of catching any fish weighing less than $q$ is the same. For this reason there are no "zero" fish (in $f(p)$ there is no delta term). The mean weight of a caught fish is

$$\bar{p}_1 = \int_0^{q} p\,\frac{1}{q}\,dp = \frac{q}{2}$$

* Actually, in (15) the integration is performed from 0 to $p$ since $F_n(p - \mu)$ is equal to zero for $\mu > p$, and therefore $\mu$ can assume values only between 0 and $p$. If instead of the weight of a fish we consider a quantity that can assume values of both signs, then in (15) the integration is carried out from $-\infty$ to $+\infty$.

It is clear that the condition $\int_0^{\infty} f(p)\,dp = 1$ is fulfilled. Indeed,

$$\int_0^{\infty} f(p)\,dp = \int_0^{q} f(p)\,dp = \int_0^{q}\frac{1}{q}\,dp = \frac{q}{q} = 1$$

As has been pointed out above, $F_1(p) = f(p)$. Let us find $F_2(p)$. Using (15), we get

$$F_2(p) = \int_0^{\infty} f(\mu)\,f(p - \mu)\,d\mu$$

Since $f(\mu)$ is different from zero and is equal to $\frac{1}{q}$ only when $0 < \mu < q$, then $F_2(p) = \int_0^{q}\frac{1}{q}\,f(p - \mu)\,d\mu$. Put $p - \mu = t$ in this last integral; then $dt = -d\mu$ and we get

$$F_2(p) = -\frac{1}{q}\int_p^{p-q} f(t)\,dt = \frac{1}{q}\int_{p-q}^{p} f(t)\,dt$$

Now let us consider separately the case $0 < p < q$ and the case $q < p < 2q$. If $0 < p < q$, then $p - q < 0$. Taking into account that the function $f(t)$ is different from zero (and is equal to $1/q$) only when $0 < t < q$, we get $F_2(p) = \frac{1}{q}\int_0^{p}\frac{1}{q}\,dt = \frac{p}{q^2}$. But if $q < p < 2q$, then we integrate from $p - q$ to $q$. And so in this case $F_2(p) = \frac{1}{q}\int_{p-q}^{q}\frac{1}{q}\,dt = \frac{2q - p}{q^2}$. Thus

$$F_2(p) = \begin{cases} \dfrac{p}{q^2} & \text{if } 0 < p < q, \\[2mm] \dfrac{2q - p}{q^2} & \text{if } q < p < 2q \end{cases}$$
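
The triangular form of $F_2(p)$ can be checked by evaluating the integral in (15) numerically. The sketch below uses a midpoint rule on $0 < \mu < q$ (outside this interval $f(\mu)$ vanishes); the function names, the choice $q = 1$, and the number of subdivisions are assumptions made for the illustration.

```python
def f(p, q=1.0):
    """Uniform distribution function: 1/q on (0, q), zero elsewhere."""
    return 1.0 / q if 0.0 < p < q else 0.0

def next_distribution(F_n, p, q=1.0, steps=2000):
    """Midpoint-rule approximation of formula (15); the integral is cut at q, beyond which f = 0."""
    h = q / steps
    return sum(f((k + 0.5) * h, q) * F_n(p - (k + 0.5) * h) * h for k in range(steps))

def F2_numeric(p, q=1.0):
    """F_2 obtained from F_1 = f by one application of formula (15)."""
    return next_distribution(lambda s: f(s, q), p, q)

def F2_exact(p, q=1.0):
    """Piecewise-linear result derived in the text."""
    if 0.0 < p < q:
        return p / q**2
    if q <= p < 2 * q:
        return (2 * q - p) / q**2
    return 0.0

for p in (0.25, 0.5, 1.0, 1.5, 1.75):
    print(p, F2_numeric(p), F2_exact(p))
```

The same `next_distribution` helper can be applied again, with `F2_exact` in place of `f`, to tabulate $F_3(p)$ as a check on the exercise at the end of this section.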

Also note that $F_2(p) = 0$ if $p < 0$ and if $p > 2q$ since the weight of a catch cannot be negative and the weight of a catch consisting of two fish cannot exceed $2q$ for the reason that there are no fish in the pond weighing more than $q$. The graph of function $F_2(p)$ is shown in Fig. 189. We suggest that the reader obtain the function $F_3(p)$ and construct its graph. (See Exercise.)

The following are the two simplest properties of the functions $F_n(p)$.

1. $\int_0^{\infty} F_n(p)\,dp = 1$ (for any $n$). This property is obvious if we proceed from the fact that $F_n(p)$ is the distribution function. The reader will find no difficulty in proving that this relation holds true for the functions $F_2(p)$, $F_3(p)$ of the foregoing example.

2. Denote by $\bar{p}_{n+1}$ the mean weight of a catch consisting of $(n + 1)$ fish. This is to be understood as follows. Suppose we cast the line $(n + 1)$ times in a row and then compute the mean weight of the catch consisting of $(n + 1)$ fish. Then $\bar{p}_{n+1} = \bar{p}_n + \bar{p}_1$, that is, the mean weight of the catch consisting of $(n + 1)$ fish is equal to the sum of the mean weight of the catch consisting of $n$ fish and the mean weight of the catch consisting of a single fish. Therefore $\bar{p}_2 = \bar{p}_1 + \bar{p}_1 = 2\bar{p}_1$, $\bar{p}_3 = \bar{p}_2 + \bar{p}_1 = 3\bar{p}_1$, etc.; that is,

$$\bar{p}_n = n\bar{p}_1 \qquad (16)$$

We conclude with a few general properties of random variables, which are quantities that assume definite values as a result of a trial. (For example, the weight of a fish caught in a trial, that is, in pulling in the fish line, is a random variable.) Suppose we have two random variables $\xi$ and $\eta$; denote by $p$ and $q$, respectively, the possible values of these quantities, and by $f(p)$ and $\varphi(q)$ the corresponding distribution functions. As we have seen, the mean value $\bar{\xi}$ (we can also write $\bar{p}$) of $\xi$ can be computed from the formula $\bar{\xi} = \int_{-\infty}^{\infty} p\,f(p)\,dp$. The quantity $\bar{\eta}$ is expressed in similar fashion. It is easy to demonstrate that we always have

$$\overline{\xi + \eta} = \bar{\xi} + \bar{\eta}, \qquad \overline{C\xi} = C\bar{\xi} \quad (C = \text{constant}) \qquad (17)$$

Indeed, if we denote by $p_i$ and $q_i$ the values of $\xi$ and $\eta$ in the $i$th trial, then for an extremely large number $N$ of trials we can write

$$\overline{\xi + \eta} = \frac{1}{N}\sum_{i=1}^{N}(p_i + q_i) = \frac{1}{N}\sum_{i=1}^{N} p_i + \frac{1}{N}\sum_{i=1}^{N} q_i = \bar{\xi} + \bar{\eta}$$

The second equation in (17) can be verified similarly.

If the variables $\xi$ and $\eta$ are independent, then it can also be demonstrated that $\overline{\xi\eta} = \bar{\xi}\cdot\bar{\eta}$, for the probability that $\xi$ will take on a value lying between $p$ and $p + dp$, and $\eta$ a value between $q$ and $q + dq$, is, by virtue of the independence condition, equal to $f(p)\,dp\,\varphi(q)\,dq$. The corresponding value of $\xi\eta$ is equal to $pq$. Therefore, the mean value $\overline{\xi\eta}$ is obtained from the formula

$$\overline{\xi\eta} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} pq\,f(p)\,dp\,\varphi(q)\,dq = \int_{-\infty}^{\infty} p\,f(p)\,dp\cdot\int_{-\infty}^{\infty} q\,\varphi(q)\,dq = \bar{\xi}\cdot\bar{\eta}$$

which is what we set out to prove.

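The dispersion of $\xi$ about its mean value is characterized by the mean square of the difference $\Delta_\xi^2 = \overline{(\xi - \bar{\xi})^2} = \int_{-\infty}^{\infty}(p - \bar{\xi})^2 f(p)\,dp$, which is called the variance of $\xi$. (Note that the last integral is indeed a positive number since the integrand is positive.) It is easy to verify that the variance of a sum of independent quantities is equal to the sum of their variances:

$$\Delta_{\xi+\eta}^2 = \overline{[(\xi + \eta) - (\bar{\xi} + \bar{\eta})]^2} = \overline{[(\xi - \bar{\xi}) + (\eta - \bar{\eta})]^2}$$

$$= \overline{(\xi - \bar{\xi})^2} + 2\,\overline{(\xi - \bar{\xi})}\cdot\overline{(\eta - \bar{\eta})} + \overline{(\eta - \bar{\eta})^2} = \Delta_\xi^2 + 2\cdot 0\cdot 0 + \Delta_\eta^2$$

since $\overline{\xi - \bar{\xi}} = \bar{\xi} - \bar{\xi} = 0$. This property can be extended directly to the sum of any number of independent random variables.

From this it follows, in particular, that for the sum $\xi$ of $n$ independent values of $\xi_1$ the variance $\Delta_n^2$ is equal to $\Delta_n^2 = n\Delta_1^2$.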
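
The additivity of mean values and variances can be checked exactly on a small discrete example. The sketch below builds the distribution of the sum of two independent variables by multiplying probabilities; the two distributions `xi` and `eta` are hypothetical and chosen only for the illustration.

```python
from itertools import product

def mean(dist):
    """Mean value: sum of value times probability."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Mean square deviation from the mean value."""
    m = mean(dist)
    return sum((x - m) ** 2 * p for x, p in dist.items())

def sum_distribution(a, b):
    """Distribution of the sum of two independent variables (joint probability = product)."""
    out = {}
    for (x, px), (y, py) in product(a.items(), b.items()):
        out[x + y] = out.get(x + y, 0.0) + px * py
    return out

# Hypothetical discrete distributions: value -> probability.
xi = {0: 0.2, 1: 0.5, 4: 0.3}
eta = {-1: 0.6, 2: 0.4}
total = sum_distribution(xi, eta)
print(mean(total), mean(xi) + mean(eta))
print(variance(total), variance(xi) + variance(eta))
```

Both printed pairs coincide up to rounding. If the joint probabilities were not products (dependent variables), the means would still add but the variances in general would not.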


Exercise 


Find the function $F_3(p)$ for the case

$$f(p) = \begin{cases} \dfrac{1}{q} & \text{if } 0 < p < q, \\[2mm] 0 & \text{if } p < 0 \text{ or } p > q \end{cases}$$

Construct the graph of this function.


13.8  The  case  of  a very  large  number  of  trials 

In this section we consider the behaviour of the function $F_n(p)$ that was introduced in Sec. 13.7 for very large values of $n$. For the sake of simplicity, instead of the weight of a fish we first consider a variable $\xi_1$ the mean value of which is zero. Such a quantity can of course take on values of both signs. $F_n(p)$ is the distribution function of the sum of $n$ independent values of $\xi_1$.

We start with the formula (15) obtained in Sec. 13.7. But, as was pointed out in the footnote on page 546, the integral in the formula must be taken over the range from $-\infty$ to $\infty$. Expand $F_n(p - \mu)$ in a series of powers of $\mu$, confining yourself to the term containing $\mu^2$, to get

$$F_n(p - \mu) = F_n(p) - \mu\frac{dF_n(p)}{dp} + \frac{\mu^2}{2}\frac{d^2F_n(p)}{dp^2} \qquad (18)$$

Using this expansion, we get, from (15),

$$F_{n+1}(p) = F_n(p)\int_{-\infty}^{\infty} f(\mu)\,d\mu - \frac{dF_n(p)}{dp}\int_{-\infty}^{\infty} f(\mu)\,\mu\,d\mu + \frac{1}{2}\frac{d^2F_n(p)}{dp^2}\int_{-\infty}^{\infty} f(\mu)\,\mu^2\,d\mu \qquad (19)$$
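Note that $\int_{-\infty}^{\infty} f(\mu)\,d\mu = 1$ since $f(\mu)$ is the distribution function. Besides, $\int_{-\infty}^{\infty} f(\mu)\,\mu\,d\mu = 0$ for, by virtue of Sec. 13.7, this integral is equal to the mean value of $\xi_1$. Finally, we introduce the variance $\Delta_1^2 = \int_{-\infty}^{\infty}\mu^2 f(\mu)\,d\mu$ of $\xi_1$.

Now formula (19) yields

$$F_{n+1}(p) = F_n(p) + \frac{\Delta_1^2}{2}\frac{d^2F_n(p)}{dp^2} \qquad (20)$$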

Now let us consider $F_n(p)$ as a function of two variables $p$ and $n$ and write $F_n(p) = F(p; n)$. We will make use of the partial derivatives with respect to $p$ and also with respect to $n$, and we will not be disturbed by the fact that $n$ takes on only integral values, since for large $n$ a change of $n$ by one unit may be regarded as an extremely small change in comparison with $n$. We rewrite (20) thus:

$$F(p; n + 1) = F(p; n) + \frac{1}{2}\Delta_1^2\frac{\partial^2F}{\partial p^2} \qquad (21)$$

We now expand the left side in a Taylor series, confining ourselves to the first two terms, which yields

$$F(p; n + 1) = F(p; n) + \frac{\partial F}{\partial n}\cdot 1 \qquad (22)$$

Equating the right sides of (21) and (22), and then dropping the term $F(p; n)$, we arrive at the basic equation for the function $F$:

$$\frac{\partial F}{\partial n} = \frac{1}{2}\Delta_1^2\frac{\partial^2F}{\partial p^2} \qquad (23)$$

Before giving the solution of this equation, we carry out a supplementary investigation of the function $F(p; n)$, which will do much to clarify the use of Taylor's formula. We consider the graph of the function $F(p; n) = F_n(p)$, which is the distribution function of the random variable $\xi$. Since $\bar{\xi} = n\bar{\xi}_1 = 0$, the centre of gravity of the graph, for all $n$, is located at the origin. For the width of the graph we can take the root-mean-square deviation $\Delta_n = \sqrt{\Delta_n^2}$, which has the same dimensions as $\xi$. However, at the end of Sec. 13.7 we showed that $\Delta_n^2 = n\Delta_1^2$ or $\Delta_n = \sqrt{n}\,\Delta_1$; the width of the graph is of the order of $\sqrt{n}$. But then from the normalization condition $\int F\,dp = 1$ it follows that $F \propto n^{-1/2}$, whence $\frac{\partial F}{\partial n} \propto n^{-3/2}$, $\frac{\partial^2F}{\partial n^2} \propto n^{-5/2}$, etc. And from the estimate of the width of the graph it follows that $\frac{\partial F}{\partial p} \propto n^{-1/2} : \sqrt{n} = n^{-1}$, $\frac{\partial^2F}{\partial p^2} \propto n^{-1} : \sqrt{n} = n^{-3/2}$, etc.

Thus, the expansions (21) and (22) are carried to terms of the same asymptotic order and for this reason (23) is asymptotically exact.

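It is curious that the asymptotic order of the width of the graph of $F_n$ could have been obtained directly from (23) by differentiating with respect to the parameter under the integral sign (Sec. 3.6) and integrating by parts:

$$\frac{d}{dn}\Delta_n^2 = \frac{d}{dn}\int p^2F\,dp = \int p^2\frac{\partial F}{\partial n}\,dp = \frac{\Delta_1^2}{2}\int p^2\frac{\partial^2F}{\partial p^2}\,dp = \frac{\Delta_1^2}{2}\cdot 2\int F\,dp = \Delta_1^2$$

(Here we assume that $F \to 0$ sufficiently fast as $p \to \pm\infty$.) From this $\Delta_n^2 \propto n\Delta_1^2$.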

One should not grudge the time spent on such an analysis. On the one hand, even without the solution an important general property of $F$ was obtained in the process, namely the expression $\sqrt{n}\,\Delta_1$ of the width. What is more, and this is quite important, your vigilance was enhanced. You have a better grasp of the following general rule: in order to introduce approximations (i.e. to retain certain terms and reject others in the Taylor series), one has to learn as much as possible about the function.

Let us examine the solution of equation (23). It can be shown that asymptotically, for large $n$, the solution of (23) is the function $F(p; n) = A\frac{1}{\sqrt{n}}e^{-p^2/2n\Delta_1^2}$, where $A$ is any constant. We give this without any derivation. The fact that this function satisfies (23) can readily be verified by setting up $\partial F/\partial n$, $\partial^2F/\partial p^2$ and substituting the expressions obtained into (23). We choose the value of the constant $A$ from the condition that the requirement $\int F\,dp = 1$ be fulfilled:

$$\frac{A}{\sqrt{n}}\int_{-\infty}^{\infty} e^{-p^2/2n\Delta_1^2}\,dp = 1 \qquad (24)$$

Make the change of variable $p/\Delta_1\sqrt{2n} = t$, $dp/\Delta_1\sqrt{2n} = dt$. Then

$$\frac{A}{\sqrt{n}}\int_{-\infty}^{\infty} e^{-p^2/2n\Delta_1^2}\,dp = A\Delta_1\sqrt{2}\int_{-\infty}^{\infty} e^{-t^2}\,dt = A\Delta_1\sqrt{2}\cdot\sqrt{\pi} = 1$$

Thus, $A\Delta_1\sqrt{2\pi} = 1$, whence $A = 1/\Delta_1\sqrt{2\pi}$ and, finally,

$$F(p; n) = \frac{1}{\Delta_1\sqrt{2\pi n}}\,e^{-p^2/2n\Delta_1^2} \qquad (25)$$

Now let us consider a case where the mean value $\bar{\xi}_1$ is not necessarily zero. We then put $\xi_1 - \bar{\xi}_1 = \xi_1'$, whence $\xi_1 = \bar{\xi}_1 + \xi_1'$, where $\overline{\xi_1'} = 0$. Therefore the sum $\xi$ of $n$ independent values of $\xi_1$ is equal to the result of adding the constant $n\bar{\xi}_1$ and the sum $\xi'$ made up of $n$ independent values of $\xi_1'$. For $n$ large, the sum $\xi'$ has the distribution law (25). But if we add a constant to a random variable, the corresponding distribution function is merely shifted by the amount of that constant, which is to say that as a result we obtain the distribution function

$$F(p; n) = \frac{1}{\Delta_1\sqrt{2\pi n}}\,e^{-(p - n\bar{p}_1)^2/2n\Delta_1^2} \qquad (26)$$

where we have reverted to the notation of Sec. 13.7: $\bar{p}_1$ instead of $\bar{\xi}_1$. We could have obtained the solution (26) directly by skipping over the special case (25). To do this, it is necessary to obtain, with the aid of the expansions (19) and (22), the equation for the function $F$, no longer assuming that $\bar{p}_1 = 0$. This equation is of the form

$$\frac{\partial F}{\partial n} = -\bar{p}_1\frac{\partial F}{\partial p} + \frac{\Delta_1^2}{2}\frac{\partial^2F}{\partial p^2} \qquad (27)$$

(when $\bar{p}_1 = 0$ it passes into the equation (23)). After that it is easy to verify by direct substitution that the function (26) satisfies equation (27).
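
The direct substitution can be imitated numerically by replacing the derivatives with finite differences. The sketch below assumes that equation (27) has the form $\partial F/\partial n = -\bar{p}_1\,\partial F/\partial p + \frac{1}{2}\Delta_1^2\,\partial^2F/\partial p^2$ (which reduces to (23) when $\bar{p}_1 = 0$); the function names, the step `h`, and the sample values are assumptions made for the illustration.

```python
import math

def F(p, n, p1=1.0, delta1=1.0):
    """Formula (26): bell-shaped curve centred at n*p1 with variance n*delta1**2."""
    return math.exp(-(p - n * p1) ** 2 / (2 * n * delta1**2)) / (delta1 * math.sqrt(2 * math.pi * n))

def residual(p, n, p1=1.0, delta1=1.0, h=1e-3):
    """dF/dn + p1*dF/dp - (delta1^2/2)*d2F/dp2, estimated by central differences."""
    dF_dn = (F(p, n + h, p1, delta1) - F(p, n - h, p1, delta1)) / (2 * h)
    dF_dp = (F(p + h, n, p1, delta1) - F(p - h, n, p1, delta1)) / (2 * h)
    d2F_dp2 = (F(p + h, n, p1, delta1) - 2 * F(p, n, p1, delta1) + F(p - h, n, p1, delta1)) / h**2
    return dF_dn + p1 * dF_dp - 0.5 * delta1**2 * d2F_dp2

for p in (2.0, 4.0, 5.5):
    print(p, F(p, 4.0), residual(p, 4.0))
```

The residual is zero up to the discretization error of the differences, here treating $n$ as a continuous variable in the same spirit as in the text.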

In Fig. 190 we have the graph of $F(p; n)$ for the case $n = 4$, $\Delta_1 = 1$, $\bar{p}_1 = 1$; for the sake of clarity, the scales on the axes are different.

For each specific $n$, the function $F(p; n)$ is a bell-shaped curve symmetric about the vertical straight line passing through the point of maximum. As is evident from (26), the maximum results when $p = n\bar{p}_1$, which coincides with the mean value $\bar{p}_n$ found in Sec. 13.7. The height of the maximum is $1/\Delta_1\sqrt{2\pi n}$, i.e. it is proportional to $1/\sqrt{n}$, as has already been stated. Thus, the maximum is shifted rightwards as $n$ increases.

Let us determine the width of the curve, i.e. let us find out how much we have to depart from $p_{\max} = n\bar{p}_1$ for the height of the curve to diminish $e$-fold compared to the maximum. For this purpose, we have to determine $p$ from the condition

$$\frac{1}{\Delta_1\sqrt{2\pi n}}\,e^{-(p - n\bar{p}_1)^2/2n\Delta_1^2} = \frac{1}{\Delta_1\sqrt{2\pi n}}\cdot\frac{1}{e}$$

whence $\frac{(p - n\bar{p}_1)^2}{2n\Delta_1^2} = 1$, or $p - n\bar{p}_1 = \pm\Delta_1\sqrt{2n}$.

Thus, $p - p_{\max} = \pm\Delta_1\sqrt{2n}$, or the width of the curve is proportional to $\sqrt{n}$, as has already been pointed out. Naturally, the height of the maximum is inversely proportional to the width of the curve, as it should be when the area under the curve is preserved.

Note that even if, by its meaning, the variable under study assumes only positive values, the function (26) produces nonzero values when $p < 0$, which is of course at variance with reality. However, when $p < 0$, $F(p; n)$ is so small for rather large $n$ that this drawback is of no practical significance.

Fig. 191 shows how the exact curves $F(p; n)$, which are obtained from formula (15) and are shown by the solid lines, approach the approximate curves, which are obtained from formula (26) for $n = 1$, $n = 2$, $n = 3$ and are shown as dashed lines.* These curves correspond to the case

$$f(p) = \begin{cases} 1 & \text{if } 0 < p < 1, \\ 0 & \text{if } p > 1 \end{cases} \quad \text{(see example in Sec. 13.7)}$$


Now let us take an example in which the random variable $\xi_1$ can assume only two values: 1 with a probability $\alpha$ ($0 < \alpha < 1$) and 0 with a probability $\beta = 1 - \alpha$. Then the sum $\xi$ made up of $n$ independent values of $\xi_1$ may be interpreted as the number of occurrences of a certain event for $n$ independent trials if the probability of its occurrence in each trial is equal to $\alpha$. The problem of computing the probability of distinct values for $\xi$ was solved in Secs. 13.2 and 13.3. In Sec. 13.3, with the aid of the Stirling formula, we demonstrated (formula (8)) that for large $n$ the probability that $\xi$ would take on a certain integer value $n\alpha + \delta$ between 0 and $n$ is equal to

$$w(n\alpha + \delta) = \frac{1}{\sqrt{2\pi n\alpha\beta}}\,e^{-\delta^2/2\alpha\beta n}$$


Since  for  large  n the  distances  between  adjacent  possible  va- 
lues of  £ are  small  in  comparison  with  the  interval  of  all  its  values 
and  even  with  the  interval  of  its  expected  values  (see  Sec.  13.3), 
it  follows  that  for  such  n we  can  regard  £ as  a continuous  random 
variable  with  distribution  density  F(p ; n).  Then  the  probability  that 
E,  will  assume  the  value  na  + 8 may  be  computed  approximately 
from  the  formula 

iwt-rS+Y 

w(noc  +8)  = ^ F(p;  n)  dp  = F(not -j-8;  n) 

hoc+S-T 


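Comparing the last two formulas and setting $n\alpha + \delta = p$, we get

$$F(p; n) = \frac{1}{\sqrt{2\pi n\alpha\beta}}\,e^{-(p-n\alpha)^2/2n\alpha\beta}$$

However, for the random variable at hand it will be true that

$$\bar{p}_1 = \bar{\xi}_1 = 1\cdot\alpha + 0\cdot\beta = \alpha,$$

$$\Delta_1^2 = \overline{(\xi_1 - \bar{\xi}_1)^2} = (1-\alpha)^2\alpha + (0-\alpha)^2\beta = \alpha\beta$$

If in the last expression for F(p; n) we replace $\alpha\beta$ by $\Delta_1^2$ and $n\alpha$ by $n\bar{p}_1$, we again arrive at the formula (26), thus completely proving it for random variables of the special type we are considering.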
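A short numerical check of this two-valued case is sketched below. It compares the exact binomial probability with the normal expression having mean $n\alpha$ and variance $n\alpha\beta$; the function names and the sample values n = 100, $\alpha$ = 0.1 are chosen here only for illustration.

```python
import math


def binomial_probability(m: int, n: int, alpha: float) -> float:
    """Exact probability of m occurrences in n independent trials."""
    return math.comb(n, m) * alpha ** m * (1 - alpha) ** (n - m)


def normal_approximation(m: int, n: int, alpha: float) -> float:
    """Normal expression with mean n*alpha and variance n*alpha*beta."""
    beta = 1 - alpha
    delta = m - n * alpha  # deviation from the most probable value
    variance = n * alpha * beta
    return math.exp(-delta ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)


if __name__ == "__main__":
    n, alpha = 100, 0.1
    for m in (6, 8, 10, 12, 14):
        print(m, round(binomial_probability(m, n, alpha), 4),
              round(normal_approximation(m, n, alpha), 4))
```

The agreement is best near the most probable value and deteriorates in the tails, as one would expect from an approximation obtained for large n.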


* In Fig. 191 it was hard to show that F(p; 1), a discontinuous function, is depicted as a step with a vertical decline at p = 1. The function F(p; 2) is represented as a right triangle lying on the hypotenuse. The points p = 1, F = 1 on F(p; 1) and F(p; 2) coincide.
Suppose the variables $\xi_1, \xi_2, \ldots, \xi_n$ independently assume random values in the interval $-\infty, \infty$, each of these variables having its own probability distribution:

$$f_1(x),\ f_2(x),\ \ldots,\ f_n(x)$$

Their sum $\xi = \xi_1 + \xi_2 + \ldots + \xi_n$ will also take on certain random values. Let its probability distribution be F(x).

Proof is given in probability theory that in this case, if n is great, we have

$$F(x) = \frac{1}{\Delta\sqrt{2\pi}}\,e^{-\frac{(x-\bar{x})^2}{2\Delta^2}} \qquad (28)$$

Such a probability distribution law is called a normal law. To specify the formula (28), we have to indicate two quantities: $\bar{x}$ and $\Delta$.

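The factor in front of $e^{-\frac{(x-\bar{x})^2}{2\Delta^2}}$ is determined from the condition $\int_{-\infty}^{+\infty} F(x)\,dx = 1$. Indeed, let $F(x) = A e^{-\frac{(x-\bar{x})^2}{2\Delta^2}}$; then

$$\int_{-\infty}^{+\infty} F(x)\,dx = A\int_{-\infty}^{+\infty} e^{-\frac{(x-\bar{x})^2}{2\Delta^2}}\,dx$$

Let us find the last integral. Assuming $\frac{x-\bar{x}}{\Delta\sqrt{2}} = t$, $dx = \Delta\sqrt{2}\,dt$, we get

$$\int_{-\infty}^{+\infty} e^{-\frac{(x-\bar{x})^2}{2\Delta^2}}\,dx = \Delta\sqrt{2}\int_{-\infty}^{+\infty} e^{-t^2}\,dt = \Delta\sqrt{2\pi}$$

From the condition $\int_{-\infty}^{+\infty} F(x)\,dx = 1$ we find $A\cdot\Delta\sqrt{2\pi} = 1$, whence $A = \frac{1}{\Delta\sqrt{2\pi}}$.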

We suggest that the reader verify (see Exercises) that for F(x) determined by the formula (28) the relations

$$\int_{-\infty}^{+\infty} xF(x)\,dx = \bar{x}, \qquad \int_{-\infty}^{+\infty} (x-\bar{x})^2F(x)\,dx = \Delta^2$$

hold true, that is, $\bar{x}$ and $\Delta^2$ are the mean value and the variance of the random variable $\xi$.

Earlier, all these relations were derived for the particular case of the sum of n variables with the same probability distribution f(p).
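The normalization and the two relations above can also be checked numerically. The sketch below uses a plain trapezoidal rule over $\bar{x} \pm 10\Delta$; the function names, the truncation of the infinite interval and the sample values $\bar{x}$ = 2, $\Delta$ = 1.5 are assumptions made only for this illustration.

```python
import math


def normal_law(x: float, mean: float, delta: float) -> float:
    """Formula (28) with mean value `mean` and root-mean-square deviation `delta`."""
    return math.exp(-(x - mean) ** 2 / (2 * delta ** 2)) / (delta * math.sqrt(2 * math.pi))


def integrate(func, a: float, b: float, steps: int = 20000) -> float:
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h


def moments(mean: float, delta: float):
    """Return (area, mean value, variance) computed by numerical integration."""
    a, b = mean - 10 * delta, mean + 10 * delta  # tails beyond this are negligible
    area = integrate(lambda x: normal_law(x, mean, delta), a, b)
    first = integrate(lambda x: x * normal_law(x, mean, delta), a, b)
    second = integrate(lambda x: (x - mean) ** 2 * normal_law(x, mean, delta), a, b)
    return area, first, second


if __name__ == "__main__":
    print(moments(2.0, 1.5))
```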

Exercises 

1. Find $\int_{-\infty}^{+\infty} xF(x)\,dx$ and $\int_{-\infty}^{+\infty} (x-\bar{x})^2F(x)\,dx$, where F(x) is given by the formula (28).

2. Let

$$f(p) = \begin{cases} \dfrac{1}{q} & \text{if } 0 < p < q, \\ 0 & \text{if } p > q \text{ and if } p < 0 \end{cases}$$

Use formula (26) to find the probability distribution F(p; n) for the case n = 10, n = 20.

Hint. First find the quantities $\bar{p}_1$ and $\Delta_1^2$.

3.  A fisherman  is  catching  fish  in  a pond  containing  fish  weighing 
not  more  than  2 kg.  Suppose  there  is  an  equal  probability  of 
catching  a fish  weighing  between  0 and  2 kg  for  each  cast  of 
the  line. 

(a)  What  is  the  mean  weight  of  a catch  consisting  of  casting  the 
line  20  times  in  succession? 

(b)  What  is  the  probability  that  in  casting  the  line  20  times 
the  total  catch  will  not  exceed  20  kg?  22  kg?  25  kg? 


13.9  Correlational  dependence 


The theory of probability has important applications to the study of relationships between quantities. Up to this point we have considered only functional relationships, that is, such that specification of one set of values completely determines another set of values. However, let us consider, say, such quantities as the length l and the weight p of a fish that has been caught. It is natural to expect that the greater the length l, the greater the weight p; that is, that there is a relationship here. But this relationship holds only in the mean, since the length l of a fish does not uniquely determine the weight p, for fish of the same length may have different weights: a long fish may turn out to be so skinny that its weight will be less than that of a short fish, and so on.

This type of “flexible” dependence between quantities that holds only in the mean is called correlational dependence, to distinguish it from the “rigid” or “inflexible” functional dependence. Instances of correlational dependence are the relationship between the age of a person and his height, between the knowledge of a student and the mark he gets at an examination, and the like. One also hears of a correlation between luck at cards and luck in love.

A correlation obtains when the effect is felt of factors that cannot be taken into account due, say, to the involved nature of their influence. However, such a dependence may be the result of necessity, irrespective of the complexity of the factors. The point is that if, say, two quantities x and y are functionally dependent on a single parameter, x = x(t) and y = y(t), then these two relations define a rigid functional relationship between x and y (see page 121). But if there are two or more such parameters, that is

$$x = x(t_1, t_2, \ldots, t_n), \qquad y = y(t_1, t_2, \ldots, t_n)$$

then these relations fundamentally can no longer determine a functional relationship between x and y. And only if we take into account the frequency of realization of the different combinations of values of the parameters $t_1, t_2, \ldots, t_n$ can we speak of a relationship between x and y, which, however, will only be a correlation. Such precisely is the situation for the quantities l and p above, where, incidentally, it would be difficult to indicate the whole set of essential parameters.

An  important  instance  of  correlational  dependence  is  found  in 
curve  fitting  (Secs.  2.3,  2.4).  Even  if  the  true  relationship y = f(x) 
between  the  physical  quantities  x and  y is  a functional  relationship, 
the  relationship  between  the  measured  values  of  x and  y,  given 
errors  of  measurement,  is  only  a correlation. 

Let us consider a general case. Suppose we have two random variables $\xi$ and $\eta$ that are related in some fashion. As we have demonstrated above, if these variables are independent, then $\overline{\xi\eta} = \bar{\xi}\bar{\eta}$, that is, $\overline{\xi\eta} - \bar{\xi}\bar{\eta} = 0$. For this reason, the latter difference may be taken for a crude measure of the relationship between $\xi$ and $\eta$. However, one more often passes to a nondimensional correlation coefficient:

$$r = r(\xi, \eta) = \frac{\overline{\xi\eta} - \bar{\xi}\bar{\eta}}{\Delta_\xi\Delta_\eta}$$

where the denominator consists of the mean square deviations of $\xi$ and $\eta$.

By what has already been stated, this coefficient is equal to zero for independent variables. (Generally, the converse is not true.) Now let us consider another extreme case where the relationship between $\xi$ and $\eta$ is functional and also linear, i.e., $\eta = a\xi + b$, where a and b are constants. Then it is easy to verify that $\Delta_\eta^2 = a^2\Delta_\xi^2$ or $\Delta_\eta = |a|\Delta_\xi$. From the equation $\Delta_\xi^2 = \overline{(\xi - \bar{\xi})^2} = \overline{\xi^2} - 2\bar{\xi}\bar{\xi} + \bar{\xi}^2 = \overline{\xi^2} - \bar{\xi}^2$ it follows that

$$\frac{\overline{\xi\eta} - \bar{\xi}\bar{\eta}}{\Delta_\xi\Delta_\eta} = \frac{(a\overline{\xi^2} + b\bar{\xi}) - \bar{\xi}(a\bar{\xi} + b)}{\Delta_\xi|a|\Delta_\xi} = \frac{a\overline{\xi^2} + b\bar{\xi} - a\bar{\xi}^2 - b\bar{\xi}}{|a|\Delta_\xi^2} = \frac{a}{|a|}$$

That is, $r(\xi, \eta) = 1$ if a > 0, and $r(\xi, \eta) = -1$ if a < 0.

It can be demonstrated (though we will not do so here) that in all cases except that of a linear functional relationship, the coefficient of correlation lies strictly between -1 and 1: $-1 < r(\xi, \eta) < 1$.

This coefficient characterizes the degree to which the relationship between $\xi$ and $\eta$ departs from a linear functional relationship.
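The extreme cases just described are easy to reproduce on a small set of equally probable values. The sketch below computes r from the definition $(\overline{\xi\eta} - \bar{\xi}\bar{\eta})/(\Delta_\xi\Delta_\eta)$; the data and the function name are invented for illustration. The third case shows why the converse statement fails: the two quantities are functionally related, yet r = 0.

```python
import math


def correlation(xs, ys):
    """Correlation coefficient of equally probable pairs (x_i, y_i)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    var_x = sum(x * x for x in xs) / n - mean_x ** 2
    var_y = sum(y * y for y in ys) / n - mean_y ** 2
    return (mean_xy - mean_x * mean_y) / math.sqrt(var_x * var_y)


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
r_up = correlation(xs, [3 * x + 1 for x in xs])     # linear with a > 0
r_down = correlation(xs, [-3 * x + 1 for x in xs])  # linear with a < 0
r_parabola = correlation(xs, [x * x for x in xs])   # dependent, but not linearly

if __name__ == "__main__":
    print(r_up, r_down, r_parabola)
```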

Let us take an example. Suppose the physical quantities x and y are related linearly, y = ax + b, but the values of the coefficients a and b are unknown and are obtained by an experiment in which we specify values of x and measure values of y. For the sake of simplicity we will assume that the values of x are known with utmost precision and the degree of accuracy in determining the corresponding value of y is the same for all x. Denoting by $\xi$ the measured value of x and by $\eta$ the measured value of y ($\xi$, $\eta$ are random variables), we assume that with equal probability $\xi$ takes on all values between -l and l, which means the appropriate distribution function is of the form

$$\varphi(x) = \begin{cases} \dfrac{1}{2l} & (-l < x < l), \\ 0 & (|x| > l) \end{cases} \qquad (29)$$

We also assume that for every value $\xi = x$ the variable $\eta$ is normally distributed about the value ax + b with variance $\Delta^2$ that does not depend on x. (It is natural to assume the normal distribution law if the error in determining y is made up of a large number of independent errors due to a variety of reasons.) We denote by $\psi_x(y)$ the distribution function of $\eta$ for a given value $\xi = x$ so that

$$\psi_x(y) = \frac{1}{\Delta\sqrt{2\pi}}\,e^{-[y-(ax+b)]^2/2\Delta^2}$$


Furthermore, denote by f(x, y) the joint distribution function of the variables $\xi$, $\eta$; this means that the probability of a simultaneous occurrence of $\xi$ lying between x and x + dx and $\eta$ lying between y and y + dy is equal to f(x, y) dx dy. To calculate the function f(x, y) note that if a large number N of trials has been carried out, then the number of simultaneous occurrences of $\xi$ lying between x and x + dx and of $\eta$ lying between y and y + dy is equal to [f(x, y) dx dy]N. On the other hand, we can reason as follows: for N trials the number of cases where $\xi$ lies between x and x + dx is equal to $N_1 = [\varphi(x)\,dx]N$. But of these $N_1$ cases, the number of cases where $\eta$ lies between y and y + dy is equal to

$$[\psi_x(y)\,dy]N_1 = \psi_x(y)\,\varphi(x)\,dx\,dy\cdot N$$

Equating both results we find that $f(x, y) = \varphi(x)\,\psi_x(y)$. That is, under the assumptions we have made,

$$f(x, y) = \begin{cases} \dfrac{1}{2l\Delta\sqrt{2\pi}}\,e^{-[y-(ax+b)]^2/2\Delta^2} & (|x| < l), \\ 0 & (|x| > l) \end{cases} \qquad (30)$$


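Let us compute the correlation coefficient $r(\xi, \eta)$. For reasons of symmetry it is clear that $\bar{\xi} = 0$, $\bar{\eta} = b$. The mean value $\overline{\xi\eta}$ is computed by the general rule (the sum of the products of the values of the random variable by their probabilities), that is, $\overline{\xi\eta} = \iint xy\,f(x, y)\,dx\,dy$, or

$$\overline{\xi\eta} = \int_{-l}^{l} dx\int_{-\infty}^{\infty} \frac{xy}{2l\Delta\sqrt{2\pi}}\,e^{-[y-(ax+b)]^2/2\Delta^2}\,dy = \frac{1}{2l}\int_{-l}^{l} x(ax+b)\,dx = \frac{l^2}{3}a$$

From the formulas (29) we get $\Delta_\xi^2 = \int_{-l}^{l} x^2\frac{1}{2l}\,dx = \frac{l^2}{3}$. In order to compute $\Delta_\eta$ we first find the distribution function $\psi(y)$ of $\eta$ (this is not the same thing as $\psi_x(y)$!). Since $\psi(y)\,dy$ is the probability that $\eta$ will be between y and y + dy and $\xi$ will assume an arbitrary value, it follows that $\psi(y)\,dy = \int_{-\infty}^{\infty} [f(x, y)\,dy]\,dx$. And so

$$\psi(y) = \frac{1}{2l\Delta\sqrt{2\pi}}\int_{-l}^{l} e^{-[y-(ax+b)]^2/2\Delta^2}\,dx$$

whence

$$\Delta_\eta^2 = \int_{-\infty}^{\infty} (y-b)^2\,\psi(y)\,dy = \int_{-\infty}^{\infty} dy\int_{-l}^{l} (y-b)^2\frac{1}{2l\Delta\sqrt{2\pi}}\,e^{-[y-(ax+b)]^2/2\Delta^2}\,dx$$

After inverting the order of integration and carrying out the computations, which we leave to the reader, we get $\Delta_\eta^2 = \frac{l^2a^2}{3} + \Delta^2$.

From this we find the desired coefficient

$$r = \frac{\dfrac{l^2a}{3}}{\dfrac{l}{\sqrt{3}}\sqrt{\dfrac{l^2a^2}{3} + \Delta^2}} = \frac{\dfrac{al}{\Delta}}{\sqrt{\left(\dfrac{al}{\Delta}\right)^2 + 3}}$$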

It is quite evident that |r| < 1; as $\frac{al}{\Delta} \to \infty$ (for instance, when a = constant $\ne$ 0, l = constant, $\Delta \to 0$), r will tend to ±1, which means the correlation is extremal; when $\frac{al}{\Delta} \to 0$ (for example, for a = constant, l = constant, $\Delta \to \infty$), then $r \to 0$, and the correlation is lost.
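The expression for r obtained in this example can be compared with a simulated experiment. The sketch below draws $\xi$ uniformly on (-l, l) and $\eta$ normally about $a\xi + b$ with root-mean-square deviation $\Delta$, as assumed in (29) and (30); the function names, the number of trials and the fixed random seed are choices made only for this illustration.

```python
import math
import random


def theoretical_r(a: float, l: float, delta: float) -> float:
    """r = (al/Delta) / sqrt((al/Delta)^2 + 3)."""
    ratio = a * l / delta
    return ratio / math.sqrt(ratio ** 2 + 3)


def simulated_r(a: float, b: float, l: float, delta: float, trials: int, seed: int = 0) -> float:
    """Sample correlation coefficient from `trials` simulated measurements."""
    rng = random.Random(seed)
    xs = [rng.uniform(-l, l) for _ in range(trials)]
    ys = [a * x + b + rng.gauss(0.0, delta) for x in xs]
    mean_x = sum(xs) / trials
    mean_y = sum(ys) / trials
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    for delta in (0.1, 1.0, 10.0):
        print(delta, theoretical_r(1.0, 1.0, delta), simulated_r(1.0, 2.0, 1.0, delta, 20000))
```

Decreasing $\Delta$ at fixed a and l drives both values toward 1, while increasing it drives them toward 0, in line with the limiting cases above.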

Suppose an experiment is carried out with N measurements made; then for the values $x = x_1, x_2, \ldots, x_N$ we obtain the corresponding values $y = y_1, y_2, \ldots, y_N$. If the measurements are independent, then such a set of values has a joint distribution function equal, by virtue of (30), to

$$f(x_1, y_1)\,f(x_2, y_2)\cdots f(x_N, y_N) = \left(\frac{1}{2l\Delta\sqrt{2\pi}}\right)^N e^{-\sum[y_i-(ax_i+b)]^2/2\Delta^2} \qquad (31)$$

(Here the summation $\sum$ is taken with respect to i from 1 to N.) If we know the result of the experiment and we proceed from the linear relationship y = ax + b, but with coefficients a and b unknown, then it is natural to choose them so that this result should be in the range of the greatest possible distribution density, that is, that it should be the most probable, in a certain sense. This means that the coefficients a and b are chosen from the condition of maximizing the right member of (31), that is to say, minimizing the sum $\sum[y_i - (ax_i + b)]^2$. We arrive at the method of least squares that has already been discussed in Sec. 2.3. Thus, the method of least squares is substantiated by arguments of probability theory.
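The minimization of $\sum[y_i - (ax_i + b)]^2$ has a well-known closed-form solution, sketched below. The data values and function names are made up for illustration; the formulas are the standard ones obtained by setting the derivatives of the sum with respect to a and b equal to zero.

```python
def least_squares_line(xs, ys):
    """Return (a, b) minimizing the sum of [y_i - (a*x_i + b)]^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def sum_of_squares(xs, ys, a, b):
    """The quantity in the exponent of (31), up to the factor 1/(2*Delta^2)."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.1, 2.9, 5.2, 6.8]
    a, b = least_squares_line(xs, ys)
    print(a, b, sum_of_squares(xs, ys, a, b))
```

Any other pair (a, b) gives a larger sum of squares and hence a smaller value of the right member of (31).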

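After the foregoing experiment we can compute the correlation coefficient from the formula (note that $\overline{\xi\eta} - \bar{\xi}\bar{\eta} = \overline{(\xi - \bar{\xi})(\eta - \bar{\eta})}$)

$$r_N = \frac{\overline{(\xi - \bar{\xi})(\eta - \bar{\eta})}}{\sqrt{\overline{(\xi - \bar{\xi})^2}}\sqrt{\overline{(\eta - \bar{\eta})^2}}} = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})/N}{\sqrt{\sum(x_i - \bar{x})^2\sum(y_i - \bar{y})^2/N^2}}$$

where $\bar{x} = \sum x_i/N$, $\bar{y} = \sum y_i/N$. Multiplying the numerator and denominator by N, we get

$$r_N = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum(x_i - \bar{x})^2\sum(y_i - \bar{y})^2}}$$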


If $|r_N|$ turns out to be small, the suspicion may arise that indeed r = 0 and the fact that $r_N$ differs from zero arose due to the natural spread of experimental values. It can be demonstrated (though we will not give the proof) that if the true value of r = 0, then for large N the observed value of $r_N$ approximately obeys the normal law with mean value $\bar{r}_N = 0$ and variance $\Delta_N^2 = \frac{1}{N-1}$. Arguing as in Sec. 13.3, we find that the probability of the inequality $|r_N| < \varepsilon$ is equal to

$$\frac{1}{\Delta_N\sqrt{2\pi}}\int_{-\varepsilon}^{\varepsilon} e^{-t^2/2\Delta_N^2}\,dt = \Phi\left(\frac{\varepsilon}{\Delta_N}\right) = \Phi\left(\varepsilon\sqrt{N-1}\right)$$

Equating, for example, the right side to 0.95, from the table we find that with the probability 0.95 we have $|r_N| < \varepsilon$, where $\varepsilon\sqrt{N-1} = 1.96 \approx 2$. Thus, if experiment yields $|r_N| \ge \frac{2}{\sqrt{N-1}}$, then with probability at least 0.95 we can assert that the correlation coefficient between the physical quantities under study is different from zero, that is, these quantities are connected by a correlation.
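The criterion can be written as a few lines of code. The sketch below assumes that the tabulated probability integral is $\Phi(x) = \frac{2}{\sqrt{2\pi}}\int_0^x e^{-t^2/2}\,dt$, which agrees with the tabulated value $\Phi(1.96) = 0.950$ and can be expressed through the error function; the function names are illustrative.

```python
import math


def probability_integral(x: float) -> float:
    """Phi(x) = (2/sqrt(2*pi)) * integral from 0 to x of exp(-t^2/2) dt."""
    return math.erf(x / math.sqrt(2))


def probability_within(eps: float, n_measurements: int) -> float:
    """Probability that |r_N| < eps when the true r is zero (large N)."""
    return probability_integral(eps * math.sqrt(n_measurements - 1))


def significance_threshold(n_measurements: int) -> float:
    """Value 2/sqrt(N - 1) that |r_N| must reach for the 0.95 criterion."""
    return 2 / math.sqrt(n_measurements - 1)


if __name__ == "__main__":
    for n in (10, 26, 101, 401):
        print(n, round(significance_threshold(n), 3))
```

The threshold falls only as $1/\sqrt{N}$, so detecting a weak correlation requires a large number of measurements.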

Note in conclusion that if the correlation coefficient turns out to be zero, then this does not yet imply the absence of a relationship, for this relationship may turn out to be nonlinear. Therefore, any empirical eliciting of the relationship y(x) should always begin with a plotting of the experimental results on a coordinate plane xy. It may turn out that the empirical points will closely enough approximate a parabola or some other simple curve, which will then determine the functional relationship y(x). Only if the empirical points form a “cloud” is it necessary to treat them statistically, in which case it is advisable to apply the notion of correlation if the “cloud” resembles a linear relationship, like that shown in Fig. 12. A nonlinear relationship can be investigated over small ranges of x by approximately assuming it to be linear, or linearity can be sought with the aid of a preliminary transformation of the variables by the methods given in Sec. 2.4.

13.10  On  the  distribution  of  primes 

Here  is  an  interesting  example  of  the  application  of  probability 
theory  to  number  theory.  We  consider  the  question  of  the  distribu- 
tion of  prime  numbers  among  the  natural  numbers.  This  is  a very 
complicated  question  and  so  our  reasoning  will  not  be  rigorous, 
though  it  will  give  a good  general  picture  of  the  subject. 

In ancient times it had already been proved that there does not exist a greatest prime number: hence, there is an infinity of primes. Even a cursory glance at the primes among the first several hundred positive integers convinces us that the primes are distributed in a highly irregular fashion. For example, following the prime 113 come 13 composite numbers, the fourteenth number (127) being prime. There are only three composite numbers between the prime 127 and the prime 131. There are five composite numbers between the primes 131 and 137, and then the next prime is 139.

And so we have the interesting question of how many primes there are between the numbers n and n + $\Delta$. We will use the term distribution density of primes for the ratio of the number of primes between n and n + $\Delta$ to the number $\Delta$. This section will be devoted to determining the dependence of the distribution density of primes on n.

We consider $\Delta$ to be small in comparison with n but substantially greater than 1. (We thus assume that n is a sufficiently large number.) Under this assumption there are many primes between n and n + $\Delta$. We denote the distribution density of primes by f(n). Then there will be f(n)$\Delta$ primes in the interval between n and n + $\Delta$.

We  can  make  a list  of  all  prime  numbers  not  exceeding  n in  the 
following  manner. 

Write down all the natural numbers from 2 to n. Then cross out all numbers divisible by 2 (not counting the number 2 itself), then cross out all numbers divisible by 3, except 3 itself, then all numbers divisible by 5, except 5, and so forth. In this way we cross out all composite numbers and leave only primes. This method goes by the name of the sieve of Eratosthenes.
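The crossing-out procedure translates directly into code. The sketch below is one common way of writing it; the function name is illustrative, and starting the crossing-out at $p^2$ rather than at 2p is a standard shortcut (smaller multiples of p have already been crossed out by smaller primes), not something stated in the text.

```python
def sieve_of_eratosthenes(n: int) -> list:
    """Return all primes not exceeding n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Cross out the multiples of p, keeping p itself.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [k for k in range(2, n + 1) if is_prime[k]]


if __name__ == "__main__":
    print(sieve_of_eratosthenes(139)[-5:])
```

The last few primes up to 139 illustrate the irregular gaps mentioned earlier: 113, 127, 131, 137, 139.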

Clearly, 1/2 of all the numbers from 2 to n are divisible by 2, only 1/3 of these numbers are divisible by 3, 1/5 by 5, etc. And so if we are considering the natural numbers from 2 to n, then the $\frac{1}{p}$th part of them is divisible by the prime number p and the $\left(1 - \frac{1}{p}\right)$th part is not divisible by p. Here we assume that $p \ll n$.

Suppose we have two primes, $p_1$ and $p_2$. Take any natural number n. It may or may not be divisible by $p_1$. The same goes for $p_2$. We assume that the two events, the first being that n is not divisible by $p_1$ and the second that n is not divisible by $p_2$, are independent.

Of the numbers from 2 to n, the portion of numbers that are not divisible by 2 is equal to $\left(1 - \frac{1}{2}\right)$, the portion of numbers that are not divisible by 3 is equal to $\left(1 - \frac{1}{3}\right)$, the portion of numbers that are not divisible by 5 is equal to $\left(1 - \frac{1}{5}\right)$, and, generally, the portion of numbers that are not divisible by the prime p is $\left(1 - \frac{1}{p}\right)$.

Since we assume as independent the events consisting in the fact that a number is not divisible by various smaller primes, then by the law of compound probabilities the portion of primes located between 2 and n is equal to

$$f(n) = \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{5}\right)\cdots\left(1 - \frac{1}{p}\right) \qquad (32)$$

(We equated this portion to the distribution density of primes f(n) on the basis of the approximate equality $F(n)/n \approx F'(n)$ for large F(n) and n.)

If $p \gg 1$, then $e^{-1/p} \approx 1 - \frac{1}{p}$ (we disregard the terms $1/p^2$, $1/p^3$, ...). Therefore the formula (32) takes the form $f(n) = e^{-\frac{1}{2}}\cdot e^{-\frac{1}{3}}\cdots e^{-\frac{1}{p}}$, whence

$$\ln f(n) = \left(-\frac{1}{2}\right) + \left(-\frac{1}{3}\right) + \ldots + \left(-\frac{1}{p}\right) = -\sum_i \frac{1}{p_i} \qquad (33)$$


In the interval between $\nu$ and $\nu + d\nu$ (where $d\nu$ is small compared with $\nu$) there are $f(\nu)\,d\nu$ prime numbers. In view of the fact that the interval is short in length, we can assume that all primes in this interval are approximately equal to $\nu$ and therefore $\frac{1}{p_i}$ is $\frac{1}{\nu}$ for this interval. Then for numbers in the interval from $\nu$ to $\nu + d\nu$, $\sum\frac{1}{p_i}$ takes the form $\frac{1}{\nu}f(\nu)\,d\nu$. Consequently, for the sum $\sum\frac{1}{p_i}$ that corresponds to the interval from 2 to n we get

$$\sum_i \frac{1}{p_i} = \int \frac{f(\nu)}{\nu}\,d\nu$$

What are the limits of integration in this integral? We are interested in all primes not exceeding n, and so we take n for the upper limit of integration. Note that the equation $1 - \frac{1}{p_i} = e^{-1/p_i}$ is good only for large $p_i$. If $p_i$ is small, then the last formula is noticeably in error. For example, for p = 2 we get $1 - \frac{1}{2} = 0.5$, $e^{-1/2} = 0.61$ (the error exceeds 20%). Replacing the sum by an integral is also good only for intervals where $p_i$ is sufficiently great. It is therefore impossible to indicate the lower limit of integration. We will leave the lower limit of integration as being indefinite. Formula (33) takes the form

$$\ln f(n) = -\int_a^n \frac{f(\nu)}{\nu}\,d\nu \qquad (34)$$

Taking the derivative of both sides of this equation, we get $\frac{1}{f(n)}\frac{df(n)}{dn} = -\frac{f(n)}{n}$. We have an equation with variables separable that is easy to solve. Rewrite it thus:

$$\frac{df(n)}{f^2(n)} = -\frac{dn}{n}$$

Integrate both sides from $n_0$ to n to get

$$\int_{n_0}^{n} \frac{df}{f^2} = -\int_{n_0}^{n} \frac{dn}{n}$$

or

$$-\frac{1}{f(n)} + \frac{1}{f(n_0)} = -\ln n + \ln n_0$$

whence

$$\frac{1}{f(n)} = \frac{1}{f(n_0)} - \ln n_0 + \ln n = C + \ln n$$

where $C = \frac{1}{f(n_0)} - \ln n_0$.

We finally get

$$f(n) = \frac{1}{C + \ln n} \qquad (35)$$

In the case of large n, the constant C may be neglected compared with ln n. And so for very large n we have

$$f(n) = \frac{1}{\ln n} \qquad (36)$$
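Formula (36) can be compared with an actual count of primes in an interval. The sketch below counts the primes between n and n + $\Delta$ with a simple sieve and divides by $\Delta$; the function names and the sample values n = 100000, $\Delta$ = 10000 are chosen only for illustration.

```python
import math


def primes_up_to(n: int) -> list:
    """All primes not exceeding n, found by the sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0] = False
    if n >= 1:
        is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [k for k in range(2, n + 1) if is_prime[k]]


def observed_density(n: int, width: int) -> float:
    """Number of primes p with n < p <= n + width, divided by width."""
    count = sum(1 for p in primes_up_to(n + width) if p > n)
    return count / width


def predicted_density(n: int) -> float:
    """Formula (36): f(n) = 1 / ln n."""
    return 1 / math.log(n)


if __name__ == "__main__":
    n, width = 100000, 10000
    print(observed_density(n, width), predicted_density(n))
```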


Our derivation was extremely crude and inaccuracies in it can easily be pointed out. For example, was it necessary to test all primes less than n for divisibility? Of course not. Actually, it would suffice to test only those primes that do not exceed $\sqrt{n}$. Suppose n is not divisible by any prime less than $\sqrt{n}$. Suppose n is divisible by the prime $p_1 > \sqrt{n}$. Then $n/p_1 = k$ and since $p_1 > \sqrt{n}$, it follows that $k < \sqrt{n}$. Note that $n = p_1k$, whence $n/k = p_1$. Consequently, n is divisible by $k < \sqrt{n}$, which runs counter to the hypothesis.

It turns out (if we reject all terms except the principal one) that (36) also satisfies the modified relation (34) in which the upper limit n of the integral is replaced by $\sqrt{n}$. On the basis of entirely different ideas, it has been demonstrated (in a much more involved manner) that the accuracy of formula (36) for large n is very great and better than could have followed from our reasoning. In particular, it turned out that (35) offers the best asymptotic representation of f(n) precisely for C = 0.

Very often the question investigated is not the density of primes but the number of primes A(n) not exceeding a given number n. It is clear that $A(n) = \int_a^n f(\nu)\,d\nu$, where the lower limit of integration a is not known. Thus

$$A(n) = \int_a^n \frac{d\nu}{\ln\nu}$$

Put $\nu = ny$, $d\nu = n\,dy$ here; then

$$A(n) = n\int_{a/n}^{1} \frac{dy}{\ln n + \ln y}$$

Note that for large n

$$\frac{1}{\ln n + \ln y} = \frac{1}{\ln n}\cdot\frac{1}{1 + \dfrac{\ln y}{\ln n}} = \frac{1}{\ln n}\left[1 - \frac{\ln y}{\ln n} + \frac{\ln^2 y}{\ln^2 n} - \ldots\right]$$

And so

$$A(n) = \frac{n}{\ln n}\int_{a/n}^{1}\left[1 - \frac{\ln y}{\ln n} + \frac{\ln^2 y}{\ln^2 n} - \ldots\right]dy$$

Confining ourselves to the first term in the expansion and replacing $\frac{a}{n}$ by 0 when n is large, we get

$$A(n) = \frac{n}{\ln n} \qquad (37)$$

If we take a larger number of terms, the formula becomes more exact. For example, confining ourselves to three terms of the expansion, we get

$$A(n) = \frac{n}{\ln n}\left[1 + \frac{1}{\ln n} + \frac{2}{\ln^2 n}\right]^{*} \qquad (38)$$

It turns out that the error of any one of such formulas is asymptotically small compared with the last "exact" term.
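A quick numerical comparison of (37) and (38) with an exact count is sketched below. The function names are illustrative, and the exact count is obtained by a sieve rather than from a table of primes.

```python
import math


def count_primes(n: int) -> int:
    """Exact number of primes not exceeding n (sieve of Eratosthenes)."""
    if n < 2:
        return 0
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return sum(is_prime)


def approx_37(n: int) -> float:
    """Formula (37): A(n) = n / ln n."""
    return n / math.log(n)


def approx_38(n: int) -> float:
    """Formula (38): three terms of the expansion."""
    log_n = math.log(n)
    return n / log_n * (1 + 1 / log_n + 2 / log_n ** 2)


if __name__ == "__main__":
    for n in (1000, 10000, 100000):
        print(n, count_primes(n), round(approx_37(n), 1), round(approx_38(n), 1))
```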

Exercises 

1. Approximate the number of primes lying in the interval between 3000 and 3100, between 3000 and 3200, and between 3000 and 3500. Compare the result with the true figure by counting the number of primes in a table of prime numbers.

2. Determine the number of primes less than 4000 by the formula (37). Refine the result by retaining in (38) the term on the right containing $\frac{1}{\ln n}$, and then the term containing $\frac{1}{(\ln n)^2}$. Compare the result with the exact figure obtained by counting the number of primes less than 4000 in a table of primes.

3. Calculate the number of primes in the interval between 2000 and 5000. Perform the calculations in different ways:

(a) by assuming that n = 2000, $\Delta$ = 3000;

(b) by finding the difference between A(5000) and A(2000). Compute these quantities using (37) and then, more exactly, by retaining the term containing $\frac{1}{\ln n}$.


* Here we had to find $\int_0^1 \ln y\,dy$ and $\int_0^1 \ln^2 y\,dy$. Both integrals can readily be calculated by integration by parts. Note that $y\ln^k y = 0$ for y = 0 for any k.
Table of the Probability Integral

$$\Phi(x) = \frac{2}{\sqrt{2\pi}}\int_0^x e^{-t^2/2}\,dt$$

  x     Φ(x)      x     Φ(x)      x     Φ(x)      x     Φ(x)
 0.00   0.000    0.36   0.281    0.71   0.522    1.06   0.711
 0.01   0.008    0.37   0.289    0.72   0.528    1.07   0.715
 0.02   0.016    0.38   0.296    0.73   0.535    1.08   0.720
 0.03   0.024    0.39   0.303    0.74   0.541    1.09   0.724
 0.04   0.032    0.40   0.311    0.75   0.547    1.10   0.729
 0.05   0.040    0.41   0.318    0.76   0.553    1.11   0.733
 0.06   0.048    0.42   0.326    0.77   0.559    1.12   0.737
 0.07   0.056    0.43   0.333    0.78   0.565    1.13   0.742
 0.08   0.064    0.44   0.340    0.79   0.570    1.14   0.746
 0.09   0.072    0.45   0.347    0.80   0.576    1.15   0.750
 0.10   0.080    0.46   0.354    0.81   0.582    1.16   0.754
 0.11   0.088    0.47   0.362    0.82   0.588    1.17   0.758
 0.12   0.096    0.48   0.369    0.83   0.593    1.18   0.762
 0.13   0.103    0.49   0.376    0.84   0.599    1.19   0.766
 0.14   0.111    0.50   0.383    0.85   0.605    1.20   0.770
 0.15   0.119    0.51   0.390    0.86   0.610    1.21   0.774
 0.16   0.127    0.52   0.397    0.87   0.616    1.22   0.778
 0.17   0.135    0.53   0.404    0.88   0.621    1.23   0.781
 0.18   0.143    0.54   0.411    0.89   0.627    1.24   0.785
 0.19   0.151    0.55   0.418    0.90   0.632    1.25   0.789
 0.20   0.159    0.56   0.425    0.91   0.637    1.26   0.792
 0.21   0.166    0.57   0.431    0.92   0.642    1.27   0.796
 0.22   0.174    0.58   0.438    0.93   0.648    1.28   0.799
 0.23   0.182    0.59   0.445    0.94   0.653    1.29   0.803
 0.24   0.190    0.60   0.451    0.95   0.658    1.30   0.806
 0.25   0.197    0.61   0.458    0.96   0.663    1.31   0.810
 0.26   0.205    0.62   0.465    0.97   0.668    1.32   0.813
 0.27   0.213    0.63   0.471    0.98   0.673    1.33   0.816
 0.28   0.221    0.64   0.478    0.99   0.678    1.34   0.820
 0.29   0.228    0.65   0.484    1.00   0.683    1.35   0.823
 0.30   0.236    0.66   0.491    1.01   0.687    1.36   0.826
 0.31   0.243    0.67   0.497    1.02   0.692    1.37   0.829
 0.32   0.251    0.68   0.503    1.03   0.697    1.38   0.832
 0.33   0.259    0.69   0.510    1.04   0.702    1.39   0.835
 0.34   0.266    0.70   0.516    1.05   0.706    1.40   0.838
 0.35   0.274

  x     Φ(x)      x     Φ(x)      x     Φ(x)      x     Φ(x)
 1.41   0.841    1.66   0.903    1.91   0.944    2.55   0.989
 1.42   0.844    1.67   0.905    1.92   0.945    2.60   0.991
 1.43   0.847    1.68   0.907    1.93   0.946    2.65   0.992
 1.44   0.850    1.69   0.909    1.94   0.948    2.70   0.993
 1.45   0.853    1.70   0.911    1.95   0.949    2.80   0.995
 1.46   0.856    1.71   0.913    1.96   0.950    2.90   0.996
 1.47   0.858    1.72   0.915    1.97   0.951    3.00   0.997
 1.48   0.861    1.73   0.916    1.98   0.952    3.10   0.998
 1.49   0.864    1.74   0.918    1.99   0.953    3.20   0.999
 1.50   0.866    1.75   0.920    2.00   0.954    3.30   0.999
 1.51   0.869    1.76   0.922    2.05   0.960    3.40   0.999
 1.52   0.871    1.77   0.923    2.10   0.964    3.50   1 - 0.5·10^-3
 1.53   0.874    1.78   0.925    2.15   0.968    3.9    1 - 10^-4
 1.54   0.876    1.79   0.927    2.20   0.972    4.4    1 - 10^-5
 1.55   0.879    1.80   0.928    2.25   0.976    4.9    1 - 10^-6
 1.56   0.881    1.81   0.930    2.30   0.979    5.3    1 - 10^-7
 1.57   0.884    1.82   0.931    2.35   0.981
 1.58   0.886    1.83   0.933    2.40   0.984
 1.59   0.888    1.84   0.934    2.45   0.986
 1.60   0.890    1.85   0.936    2.50   0.988
 1.61   0.893    1.86   0.937
 1.62   0.895    1.87   0.939
 1.63   0.897    1.88   0.940
 1.64   0.899    1.89   0.941
 1.65   0.901    1.90   0.943
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_ 1 — A 
Wx>  ~ 7 ~~  7 

and  the  probability  of  white  is 
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Since  = ww  • ww,  it  follows  that 


Similarly 


The  probability  that  white  turns  up  once  and  black  the  next 
time  (the  order  in  which  they  appear  is  immaterial)  can  be  com- 
puted from  the  general  formula  — — amp*.  We  get  — • 

m\k\  9 

2 4 

4,  Using  the  general  formula  we  find  — > — • 

5.  0.18.  6.  0.243,  0.027.  7.  0.29,  0.33. 

Sec.  13.3 

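1. Using formula (3) and setting n = 1000, $\delta$ = 0, and then n = 1000, $\delta$ = 10, we obtain 0.025, 0.020.

2. The probability of a certain number of heads not exceeding 500 is equal to

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{0} e^{-t^2/2}\,dt = \frac{1}{\sqrt{2\pi}}\int_{0}^{+\infty} e^{-t^2/2}\,dt = \frac{1}{2}\cdot\frac{2}{\sqrt{2\pi}}\int_{0}^{+\infty} e^{-t^2/2}\,dt = \frac{1}{2}$$

The probability of obtaining at least 500 heads is $1 - \frac{1}{2} = \frac{1}{2}$.

The probability of obtaining not more than 510 heads ($\delta$ = 10) is

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{0.63} e^{-t^2/2}\,dt = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{0} e^{-t^2/2}\,dt + \frac{1}{\sqrt{2\pi}}\int_{0}^{0.63} e^{-t^2/2}\,dt = 0.5 + \frac{1}{2}\cdot 0.471 = 0.74$$

Therefore the probability of obtaining at least 510 heads is 1 - 0.74 = 0.26.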
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3.  Putting  in  (8)  8 = 0 (10  hits),  we  getw  = 0.13,  putting  8 = — 2 
(8  hits),  we  get  = 0.11. 

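4. Denote by $\delta$ the deviation of the number of hits from the most
probable number. Then the probability that one of the events
for which $\delta \le \delta_0$ is realized is equal to

$$w = \frac{1}{\sqrt{2\pi n\alpha\beta}} \int_{-\infty}^{\delta_0} e^{-\frac{\delta^2}{2n\alpha\beta}}\,d\delta$$

Putting $\dfrac{\delta}{\sqrt{n\alpha\beta}} = t$, $d\delta = \sqrt{n\alpha\beta}\,dt$ here, we get

$$w = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t_0} e^{-t^2/2}\,dt$$

where $t_0 = \dfrac{\delta_0}{\sqrt{n\alpha\beta}}$. In this case, $n = 100$, $\alpha = 0.1$, $\beta = 0.9$. The
most probable number of hits is $n\alpha = 100 \cdot 0.1 = 10$. Compute
the probability that the number of hits does not exceed 8
($\delta_0 = -2$). We get

$$w = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-0.67} e^{-t^2/2}\,dt = \frac{1}{\sqrt{2\pi}} \int_{0.67}^{+\infty} e^{-t^2/2}\,dt = \frac{1}{\sqrt{2\pi}} \int_{0}^{+\infty} e^{-t^2/2}\,dt - \frac{1}{\sqrt{2\pi}} \int_{0}^{0.67} e^{-t^2/2}\,dt = 0.5 - 0.5\,\Phi(0.67) = 0.25$$

Hence the probability that the number of hits is not less than
8 is equal to $1 - 0.25 = 0.75$.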

For  the  other  two  cases  the  probabilities  are  respectively  0.5 
and  0.25. 

5.  0.73,  0.37. 
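The numerical values in answers 2 and 4 above can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the function names are ours, and we assume that the tabulated function is $\Phi(x) = \frac{2}{\sqrt{2\pi}} \int_0^x e^{-t^2/2}\,dt$, which is what the expression $0.5 - 0.5\,\Phi(0.67)$ suggests; with this definition $\Phi(x) = \operatorname{erf}(x/\sqrt{2})$.

```python
import math


def Phi(x: float) -> float:
    """Assumed probability integral: (2/sqrt(2*pi)) * int_0^x exp(-t^2/2) dt."""
    return math.erf(x / math.sqrt(2.0))


def prob_not_exceeding(delta0: float, n: int, alpha: float) -> float:
    """Normal approximation for P(deviation <= delta0) in n trials.

    alpha is the probability of success in one trial, beta = 1 - alpha.
    """
    beta = 1.0 - alpha
    t0 = delta0 / math.sqrt(n * alpha * beta)
    # Phi is odd, so this formula also covers negative t0.
    return 0.5 + 0.5 * Phi(t0)


# Coin tossing: n = 1000, alpha = 1/2, not more than 510 heads.
p_510 = prob_not_exceeding(10, 1000, 0.5)
# Shooting: n = 100, alpha = 0.1, not more than 8 hits.
p_8 = prob_not_exceeding(-2, 100, 0.1)
print(round(p_510, 2), round(p_8, 2))
```

The two printed numbers should agree with the values 0.74 and 0.25 quoted in the answers.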


Sec.  13.5 


1. $\dfrac{w_\mu(m)}{w_\mu(m-1)} = \dfrac{\mu^m e^{-\mu}/m!}{\mu^{m-1} e^{-\mu}/(m-1)!} = \dfrac{\mu}{m}$, whence $w_\mu(m) > w_\mu(m-1)$ as
long as $m < \mu$, and $w_\mu(m) < w_\mu(m-1)$ when $m > \mu$. Hence,
if $\mu$ is nonintegral, then $w_\mu(m)$ is the largest when $m$ is equal to
the integral part of $\mu$. But if $\mu = 1, 2, \ldots$, then the largest
are $w_\mu(\mu - 1) = w_\mu(\mu)$.

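2. If a large number $N$ of trials is carried out, the variable $x$
assumes the value $x_1$ a total of $p_1 N$ times, the value $x_2$, $p_2 N$ times, ...,
and the value $x_n$, $p_n N$ times. Therefore the mean value (per trial) is

$$\bar{x} = \frac{p_1 N x_1 + p_2 N x_2 + \ldots + p_n N x_n}{N} = p_1 x_1 + p_2 x_2 + \ldots + p_n x_n$$

For the Poisson law this yields

$$\bar{m} = \sum_{m=0}^{\infty} m\,\frac{\mu^m}{m!}\,e^{-\mu} = e^{-\mu} \sum_{m=1}^{\infty} \frac{\mu^m}{(m-1)!} = e^{-\mu} \sum_{k=0}^{\infty} \frac{\mu^{k+1}}{k!} = \mu e^{-\mu} e^{\mu} = \mu$$

where $m - 1 = k$.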
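Both answers of this section lend themselves to a quick numerical check. The sketch below is ours (the function names and the cutoff of 100 terms are arbitrary choices, not from the text); it builds the Poisson probabilities from the ratio $w_\mu(m)/w_\mu(m-1) = \mu/m$ and then computes the mean and the most probable value.

```python
import math


def poisson_table(mu: float, terms: int = 100) -> list:
    """Poisson probabilities w_mu(0), ..., w_mu(terms-1) via the ratio mu/m."""
    probs = [math.exp(-mu)]
    for m in range(1, terms):
        probs.append(probs[-1] * mu / m)
    return probs


def mean_value(probs: list) -> float:
    """Mean value sum of m * w_mu(m) over the tabulated terms."""
    return sum(m * p for m, p in enumerate(probs))


def most_probable(probs: list) -> int:
    """Index m at which w_mu(m) is largest (first one if there is a tie)."""
    return max(range(len(probs)), key=lambda m: probs[m])


table = poisson_table(3.7)
print(mean_value(table), most_probable(table))
```

For nonintegral $\mu = 3.7$ the mean comes out equal to $\mu$ and the largest probability sits at the integral part of $\mu$; for integral $\mu$ the two neighbouring probabilities $w_\mu(\mu-1)$ and $w_\mu(\mu)$ coincide.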

Sec.  13.6 

The system of equations is of the form

$$\frac{dp_i}{dt} = -(\alpha + \omega)\,p_i + \omega p_{i-1} + \alpha p_{i+1} \quad (i = \ldots, -2, -1, 0, 1, 2, \ldots)$$

This system has to be solved for the initial condition $p_0(0) = 1$,
$p_i(0) = 0$ ($i \ne 0$). When using the method of successive
approximation it is convenient first to make a change of variable,
$p_i = e^{-(\alpha + \omega)t} q_i$, whence

$$\frac{dq_i}{dt} = \omega q_{i-1} + \alpha q_{i+1} \quad (i = \ldots, -2, -1, 0, 1, 2, \ldots)$$

The zero-order approximation is
$q_0 = 1$, and the others are $q_i = 0$.

The first approximation is

$q_{-1} = \alpha t$, $q_0 = 1$, $q_1 = \omega t$, and the others are $q_i = 0$.

The second approximation is

$q_{-2} = \dfrac{\alpha^2 t^2}{2}$, $q_{-1} = \alpha t$, $q_0 = 1 + \alpha\omega t^2$, $q_1 = \omega t$, $q_2 = \dfrac{\omega^2 t^2}{2}$, and
the others are $q_i = 0$, and so forth. The mean value of the
number of a state is $\bar{i} = (\omega - \alpha)\,t$. The problem discussed in
the text is obtained for $\alpha = 0$, $\omega = 1$.



Sec.  13.7 

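$$F_3(p) = \begin{cases} \dfrac{p^2}{2q^3} & \text{if } 0 < p < q, \\[2mm] \dfrac{1}{2q^3}\,(-2p^2 + 6pq - 3q^2) & \text{if } q < p < 2q, \\[2mm] \dfrac{1}{2q^3}\,(p^2 - 6pq + 9q^2) & \text{if } 2q < p < 3q \end{cases}$$

If $p < 0$ and if $p > 3q$, then $F_3(p) = 0$. The graph of the function
$F_3(p)$ for $q = 1$ is shown in Fig. 191.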
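A piecewise density of this kind is easy to check numerically. In the sketch below (our own illustration; the names and the step count are arbitrary, and we assume the three quadratic branches read $p^2$, $-2p^2 + 6pq - 3q^2$ and $p^2 - 6pq + 9q^2$, each divided by $2q^3$) we verify that the branches join continuously and that the total probability equals 1.

```python
def F3(p: float, q: float = 1.0) -> float:
    """Assumed distribution density of the sum of three terms, each uniform on (0, q)."""
    if 0.0 < p <= q:
        return p * p / (2.0 * q ** 3)
    if q < p <= 2.0 * q:
        return (-2.0 * p * p + 6.0 * p * q - 3.0 * q * q) / (2.0 * q ** 3)
    if 2.0 * q < p < 3.0 * q:
        return (p * p - 6.0 * p * q + 9.0 * q * q) / (2.0 * q ** 3)
    return 0.0


def total_probability(q: float = 1.0, steps: int = 3000) -> float:
    """Midpoint-rule integral of F3 over (0, 3q)."""
    h = 3.0 * q / steps
    return h * sum(F3((k + 0.5) * h, q) for k in range(steps))


print(total_probability(1.0), F3(1.5))
```

For $q = 1$ the integral should be 1 to within the accuracy of the midpoint rule, and the maximum of the density, attained at $p = 1.5$, is 0.75.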
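Sec. 13.8

1. $$\int_{-\infty}^{+\infty} x F(x)\,dx = \frac{1}{\Delta\sqrt{2\pi}} \int_{-\infty}^{+\infty} x\,e^{-\frac{(x-\bar{x})^2}{2\Delta^2}}\,dx$$

Put $\dfrac{x - \bar{x}}{\Delta\sqrt{2}} = t$, $dx = \Delta\sqrt{2}\,dt$. Then the last integral becomes

$$\frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} (\bar{x} + \Delta\sqrt{2}\,t)\,e^{-t^2}\,dt = \bar{x} + \frac{\Delta\sqrt{2}}{\sqrt{\pi}} \int_{-\infty}^{+\infty} t e^{-t^2}\,dt = \bar{x}$$

The integral $\int_{-\infty}^{+\infty} t e^{-t^2}\,dt = 0$ because the integrand is odd so that
its graph is symmetric about the origin.

To find $\int_{-\infty}^{+\infty} (x - \bar{x})^2 \dfrac{1}{\Delta\sqrt{2\pi}}\,e^{-\frac{(x-\bar{x})^2}{2\Delta^2}}\,dx$ make the change of variable
$\dfrac{x - \bar{x}}{\Delta\sqrt{2}} = t$, $dx = \Delta\sqrt{2}\,dt$. Then the original integral becomes
$\dfrac{2\Delta^2}{\sqrt{\pi}} \int_{-\infty}^{+\infty} t^2 e^{-t^2}\,dt$. Integrate by parts and put $t e^{-t^2}\,dt = dg$, $t = f$.
Then $g = -\dfrac{1}{2} e^{-t^2}$, $df = dt$. We get

$$\frac{2\Delta^2}{\sqrt{\pi}} \int_{-\infty}^{+\infty} t^2 e^{-t^2}\,dt = \frac{2\Delta^2}{\sqrt{\pi}} \left[ -\frac{1}{2}\,t e^{-t^2} \Big|_{-\infty}^{+\infty} + \frac{1}{2} \int_{-\infty}^{+\infty} e^{-t^2}\,dt \right] = \Delta^2$$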

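2. We find $\bar{p}_1 = \dfrac{q}{2}$; $\Delta_1^2 = \dfrac{q^2}{12}$ and so

$$F(p;\,10) = \frac{1}{q} \sqrt{\frac{3}{5\pi}}\;e^{-\frac{3(p - 5q)^2}{5q^2}}, \qquad F(p;\,20) = \frac{1}{q} \sqrt{\frac{3}{10\pi}}\;e^{-\frac{3(p - 10q)^2}{10q^2}}$$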


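3. (a) Since $\bar{p}_1 = 1$ kg and $\bar{p}_n = n\bar{p}_1$, it follows that

$\bar{p}_{20} = 20 \cdot 1$ kg $= 20$ kg.

(b) Take advantage of the results of (a), setting $q$ equal to 2 in
the expression for $F(p;\,20)$, to get

$$F(p;\,20) = \frac{1}{2} \sqrt{\frac{3}{10\pi}}\;e^{-\frac{3(p - 20)^2}{40}}, \quad \text{that is,} \quad F(p;\,20) = 0.154\,e^{-0.075(p - 20)^2}$$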
The probability that the catch will not exceed 20 kg is

$$w = 0.154 \int_{0}^{20} e^{-0.075(p - 20)^2}\,dp$$

In order to reduce this integral
to the tabulated function $\Phi(x)$ put $\frac{1}{2}x^2 = 0.075(p - 20)^2$, whence

$x = 0.387(p - 20)$, $dx = 0.387\,dp$.

Therefore

$$w = \frac{0.154}{0.387} \int_{-7.7}^{0} e^{-x^2/2}\,dx = 0.4 \int_{0}^{7.7} e^{-x^2/2}\,dx = 0.4\,\frac{\sqrt{2\pi}}{2}\,\Phi(7.7) = 0.5$$

The probability that the catch does not exceed 22 kg is

$$w = 0.4 \int_{-7.7}^{0.77} e^{-x^2/2}\,dx = 0.4 \left( \int_{-7.7}^{0} e^{-x^2/2}\,dx + \int_{0}^{0.77} e^{-x^2/2}\,dx \right) = 0.4\,\frac{\sqrt{2\pi}}{2}\,[\Phi(7.7) + \Phi(0.77)] = 0.78$$

The probability that the catch does not exceed 25 kg is 0.98.
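The first two probabilities of this answer can be recomputed directly from the normal density with mean $nq/2$ and variance $nq^2/12$. The following sketch is our own illustration (the function name and its default arguments $n = 20$, $q = 2$ are assumptions taken from the answer above, not code from the text); it uses the error function of the Python standard library instead of the table of $\Phi(x)$.

```python
import math


def catch_probability(limit_kg: float, n: int = 20, q: float = 2.0) -> float:
    """Normal approximation for P(0 <= total catch <= limit_kg).

    Each of the n terms is assumed uniform on (0, q): mean q/2, variance q^2/12.
    """
    mean = n * q / 2.0
    var = n * q * q / 12.0

    def z(p: float) -> float:
        # Argument of erf for the normal law with the given mean and variance.
        return (p - mean) / math.sqrt(2.0 * var)

    return 0.5 * (math.erf(z(limit_kg)) - math.erf(z(0.0)))


print(round(catch_probability(20.0), 2), round(catch_probability(22.0), 2))
```

The printed values should agree with 0.5 and 0.78 obtained above through $\Phi(7.7)$ and $\Phi(0.77)$.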

Sec.  13.10 

1. Taking $n = 3000$, $A = 100$, 200, 500, we get the approximate
values 12, 25, 62. The exact values are equal to 12, 22, 59
respectively.

2. By formula (37) $A(4000) = 482$. By the condensed formula (38),
$A = 540$. By the complete formula (38), $A = 554$. The exact
value is $A(4000) = 550$.

3. Taking $n = 2000$, $A = 3000$, we get $N = 395$. By formula (37)
$N = 322$. By the condensed formula (38), $N = 358$. The exact
value is $N = 366$.


Chapter  14 

FOURIER  TRANSFORMATION 


14.1  Introduction 

In our discussion, in Chapters 7 and 8, of linear differential
equations with constant coefficients and also of systems of such
equations (in other words, in the consideration of linear processes
homogeneous in time with a finite number of degrees of freedom) we saw
how important was the role of the function $e^{pt}$. The point is that if
we put this function in the left member of the equation and perform all
operations, the result will be the very same function multiplied by a
constant factor. For this reason the solution of a homogeneous equation
was expressed in terms of functions of the form $e^{pt}$; so also was the
Green's function expressed in terms of these functions; what is more,
the solution of a nonhomogeneous equation was of the simplest form when
the right member contained the function $e^{pt}$. We also include here, as
a special case, functions of the form $\cos \omega t$ or $e^{\gamma t} \sin \omega t$, which
by the Euler formula are expressed directly in terms of the exponential
$e^{pt}$ with complex $p$ (see formulas (3)).

In many problems the independent variable $t$ (ordinarily the time)
can assume all values, that is, $-\infty < t < \infty$, and the solutions
must, according to the meaning of the problem, remain finite
both for finite $t$ and for $t \to \pm\infty$. Since for $p = \gamma + i\omega$ we have
$e^{pt} = e^{\gamma t}(\cos \omega t + i \sin \omega t)$, $|\cos \omega t + i \sin \omega t| = 1$, it follows that for
$\gamma > 0$ we will have $|e^{pt}|_{t \to \infty} \to \infty$ and for $\gamma < 0$ it will be true
that $|e^{pt}|_{t \to -\infty} \to \infty$. Hence, if we require the exponential $e^{pt}$ to be
bounded as $t \to \pm\infty$, then it must be true that $\gamma = 0$, that is to
say, we must make use only of "harmonics", or the functions

$$e^{i\omega t} = \cos \omega t + i \sin \omega t \tag{1}$$

When harmonics with different amplitudes and frequencies are
superimposed on one another, that is, when we consider sums of the

$$f(t) = \sum_k a_k e^{i\omega_k t} \tag{2}$$

* In this chapter we will have need of complex numbers (Secs. 5.1-5.3) and the
delta function (Secs. 6.1-6.3). In Sec. 14.7 use is made of the concept of a
multidimensional vector space (Sec. 9.6).
form, we can get functions of a much more complicated nature. The set
of frequencies $\omega_k$ considered here is called the spectrum of the function
$f(t)$; in this case we have a discrete spectrum.* We can also make use
of sums of sines or cosines with distinct frequencies and amplitudes,
for by formula (1) we can pass from exponentials to cosines and
sines, and, by the formulas

$$\cos \omega t = \frac{e^{i\omega t} + e^{-i\omega t}}{2}, \qquad \sin \omega t = \frac{e^{i\omega t} - e^{-i\omega t}}{2i} \tag{3}$$

from cosines and sines to exponentials. For the sake of uniformity
we will mainly use exponentials (cf. Sec. 5.5).
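The passage between exponentials and trigonometric functions, and the boundedness argument that singled out the harmonics, can be checked numerically. The following is only a small sketch of ours (the function names are hypothetical, not from the text), using the complex arithmetic of the Python standard library.

```python
import cmath
import math


def cos_from_exponentials(omega: float, t: float) -> float:
    """cos(omega*t) computed from exponentials, first formula of (3)."""
    value = (cmath.exp(1j * omega * t) + cmath.exp(-1j * omega * t)) / 2
    return value.real


def sin_from_exponentials(omega: float, t: float) -> float:
    """sin(omega*t) computed from exponentials, second formula of (3)."""
    value = (cmath.exp(1j * omega * t) - cmath.exp(-1j * omega * t)) / (2j)
    return value.real


def modulus(gamma: float, omega: float, t: float) -> float:
    """|exp(p*t)| for p = gamma + i*omega; equals exp(gamma*t)."""
    return abs(cmath.exp(complex(gamma, omega) * t))


print(cos_from_exponentials(2.0, 0.3), math.cos(0.6))
print(modulus(0.0, 3.0, 50.0), modulus(0.1, 3.0, 50.0))
```

The modulus stays equal to 1 for all $t$ only when the real part of $p$ is zero; for any other real part it grows without bound in one of the two directions of $t$.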

A still broader class of functions $f(t)$ is obtained if in place of
a sum over the individual frequencies we take advantage of the
integral over all frequencies, which is to say, if we make use of a
representation like

$$f(t) = \int_{-\infty}^{\infty} F(\omega)\,e^{i\omega t}\,d\omega \tag{4}$$

Here we have a continuous spectrum, which may occupy the entire
$\omega$-axis or some interval $J$ on that axis if the function $F(\omega)$ is zero
outside the interval, which means that actually the integration (4)
is carried out only over $J$ (then $J$ is termed the carrier of the
function $F(\omega)$). Incidentally, in Sec. 6.1 we saw that if the function $F(\omega)$
is a sum of delta-like terms, then the entire integral (4) passes into the
sum (2), which means the spectrum will be discrete and will consist
of the set of those values of $\omega$ at which the delta-like terms have
singularities. In the general case, the spectrum can have both a
continuous and a discrete part.

It is not difficult to grasp the meaning of the function $F(\omega)$ in
the representation (4). Since in this representation we have the term
$F(\omega)\,e^{i\omega t}\,d\omega = [F(\omega)\,d\omega]\,e^{i\omega t}$ on the small frequency interval from
$\omega$ to $\omega + d\omega$, it follows, by comparison with (2), that $F(\omega)\,d\omega$ is the
amplitude corresponding to the indicated interval of frequencies.
Hence $F(\omega)$ may be regarded as the "density of the amplitude"
corresponding to a small frequency interval and calculated per unit length
of this interval. That is why the function $F(\omega)$ is called the spectral
density of the function $f(t)$. Transition from the sum (2) to the integral
(4) is similar to the transition (cf. Sec. 12.1) from the discrete model
of a string in the form of elastically connected beads to the continuous
model, in which the mass is spread out over the entire length of the
string with a definite density. So also in the representation (4), the
amplitude of the harmonics is spread out over the entire frequency
spectrum with density $F(\omega)$.


* Take care to distinguish between the spectrum of a function which we deal with
in this chapter and the spectrum of a boundary-value problem (page 300).
In physical problems we mostly deal with real functions $f(t)$,
for example, when $f(t)$ is a force $f$ acting on a system as a function
of the time $t$.

The reality condition of $f(t)$ may be written thus:

$$f(t) = f^*(t), \quad \text{that is,} \quad \int_{-\infty}^{+\infty} F(\omega)\,e^{i\omega t}\,d\omega = \int_{-\infty}^{+\infty} F^*(\omega)\,e^{-i\omega t}\,d\omega$$

We do not put the conjugacy symbol on $\omega$ because it is assumed that
$\omega$ is real. For the sums of the harmonics to be equal for any values
of $t$, it is necessary that the corresponding amplitudes be equal. Let
us find the factor in front of $e^{i\omega_0 t}$; in the left-hand integral this is
$F(\omega_0)$. In the right-hand integral, from the condition $e^{-i\omega t} = e^{i\omega_0 t}$
we find $\omega = -\omega_0$; hence, the factor in front of $e^{i\omega_0 t}$ in the right-hand
integral is equal to $F^*(\omega) = F^*(-\omega_0)$. Thus, the condition of reality
of the function $f(t)$ yields

$$F^*(-\omega_0) = F(\omega_0)$$

This equality refers to any $\omega_0$ and so we can drop the subscript 0
and write

$$F^*(-\omega) = F(\omega) \tag{5}$$

The function $F(\omega)$ is a complex function. Let us write out
explicitly the real and imaginary parts with the aid of two real functions
$A(\omega)$ and $B(\omega)$:

$$F(\omega) = A(\omega) + iB(\omega)$$

The reality condition of the original function $f(t)$ that produced
formula (5) leads to

$$A(-\omega) - iB(-\omega) = A(\omega) + iB(\omega)$$

that is

$$A(-\omega) = A(\omega), \qquad B(-\omega) = -B(\omega) \tag{6}$$

Thus, the real part of $F$ is an even function of $\omega$ and the imaginary
part of $F$ is an odd function of $\omega$.

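From the representation with the aid of $e^{i\omega t}$ let us pass to the
representation with the aid of $\cos \omega t$ and $\sin \omega t$:

$$f(t) = \int_{-\infty}^{+\infty} F(\omega)\,e^{i\omega t}\,d\omega = \int_{-\infty}^{+\infty} [A(\omega) + iB(\omega)]\,[\cos \omega t + i \sin \omega t]\,d\omega$$

$$= \int_{-\infty}^{+\infty} A(\omega) \cos \omega t\,d\omega - \int_{-\infty}^{+\infty} B(\omega) \sin \omega t\,d\omega + i \int_{-\infty}^{+\infty} A(\omega) \sin \omega t\,d\omega + i \int_{-\infty}^{+\infty} B(\omega) \cos \omega t\,d\omega$$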
Let us consider a special but important case of the real function
$f(t)$. Then the conditions (6) will make the last two integrals vanish.
They contain the product of an even function by an odd function
so that

$$\int_{-\infty}^{0} k\,d\omega = -\int_{0}^{\infty} k\,d\omega, \qquad \int_{-\infty}^{+\infty} k\,d\omega = 0,$$

where $k$ is the integrand of the third or fourth integral.

Contrariwise, the integrand in the first two integrals is symmetric:

$$n(\omega) = n(-\omega), \qquad \int_{-\infty}^{0} n(\omega)\,d\omega = \int_{0}^{\infty} n(\omega)\,d\omega, \qquad \int_{-\infty}^{+\infty} n(\omega)\,d\omega = 2 \int_{0}^{\infty} n(\omega)\,d\omega$$

so that finally

$$f(t) = 2 \int_{0}^{\infty} A(\omega) \cos \omega t\,d\omega - 2 \int_{0}^{\infty} B(\omega) \sin \omega t\,d\omega \tag{7}$$

To summarize, the real function $f(t)$ is represented in the form of an
integral of the real functions $\cos \omega t$ and $\sin \omega t$; the corresponding
spectral densities $A(\omega)$ and $B(\omega)$ are also real. In this case the
integration is performed only over the positive frequencies with $\omega$ varying
from 0 to $\infty$.

It is a remarkable fact that it is possible to represent as (4) almost
any function $f(t)$ that remains finite as $t \to \pm\infty$. This will be shown
in Sec. 14.2 where we will also answer the basic question of how to
find the spectral density $F(\omega)$ of the given function $f(t)$. We will see
right now that this question goes far beyond being of merely
theoretical interest.

Recall the simple facts of the theory of oscillations (Secs. 7.3,
7.5; also see HM, Chs. 6 and 8). Suppose we have an oscillator with
slight damping, i.e., little friction (if we are dealing with a
mechanical oscillatory system) or with low resistance (if it is an electric
oscillatory circuit), and so on. If such an oscillator experiences a
harmonic external force with frequency $\omega$, then forced harmonic
oscillations with the same frequency will build up in the oscillator. The
amplitude of these oscillations is the greater, the closer $\omega$ is to the
frequency $\omega_0$ of the natural oscillations of the oscillator. This
"selectivity" of the oscillator relative to the frequency of the external action
is expressed the more sharply, the less the damping. In the limit,
when we consider an oscillator without damping, for $\omega = \omega_0$ resonance
sets in, which means the amplitude of the forced oscillations increases
without limit.
Now suppose the harmonic external action is applied to a system
of oscillators with small damping that have distinct natural
frequencies. Then the oscillator whose natural frequency is equal to the
frequency of the external action will respond most strongly to this
action. Finally, if such a system of oscillators is acted upon by a
mixture of harmonics, that is, by a function of the form (2), then
those oscillators will respond whose natural frequency is the same as
one of the external frequencies $\omega_k$. Here the amplitude of the forced
oscillations of the oscillator with natural frequency $\omega_0$ is proportional
to the amplitude of the action, that is, it is proportional to the $a_k$ for
which $\omega_k = \omega_0$. Thus, such a system of oscillators realizes a harmonic
analysis (the term Fourier analysis is also used; this is a kind of
spectral analysis) of the external action, which amounts to breaking
the action up into its component harmonics. A similar situation results
in the case of imposing an external action of the type (4). The
amplitude of an oscillator with natural frequency $\omega_0$ is proportional to
$F(\omega_0)$, where $F$ is the spectral density (see formula (4)). An exact
statement of the conditions necessary for this will be given below.
With a set of oscillators having all possible values $\omega_0$, it is possible
to "sense", i.e. determine, the entire course of the function $F(\omega)$.

This  kind  of  harmonic  analysis  can  actually  be  accomplished. 
For  example,  in  acoustics  use  is  made  of  a system  of  resonators,  each 
one  of  which  is  tuned  to  a specific  frequency.  If  this  system  is  acted 
upon  by  an  acoustic  (sound)  oscillation,  which  can  always  be  repre- 
sented as  a mixture  of  “pure  sounds”,  or  harmonic  oscillations,  then 
those  resonators  will  respond  whose  natural  frequency  corresponds 
to  the  spectrum  of  the  action.  By  determining  the  amplitude  of  oscil- 
lation of  the  resonators  we  thus  accomplish  the  harmonic  analysis 
of  the  external  action. 

Exercises 

1. What must the function $F(\omega)$ be equal to for the integral (4) to
pass into the sum (2)?

2. Under what condition will the sum (2) be a periodic
function of $t$?

14.2  Formulas  of  the  Fourier  transformation 

We begin with a question that will be needed later on: What
will happen if the amplitude of oscillation is uniformly spread over
the whole frequency axis, in other words, what function $f(t)$ in the
formula (4) has the spectral density $F(\omega) = 1$? This means we have
to investigate the integral

$$I(t) = \int_{-\infty}^{\infty} e^{i\omega t}\,d\omega$$

Although this is a divergent integral (Sec. 3.1), it can be represented
as the limit of an integral over a finite interval:

$$I(t) = \lim_{N \to \infty} I_N(t)$$

where

$$I_N(t) = \int_{-N}^{N} e^{i\omega t}\,d\omega = \frac{e^{itN} - e^{-itN}}{it} = 2\,\frac{\sin tN}{t}$$

(in passing to the last expression we made use of the second formula in (3)).
This result can be represented as

$$I_N(t) = 2\pi N\,\frac{\sin tN}{\pi t N} = 2\pi N F_1(tN) \tag{8}$$

where $F_1(t) = \dfrac{\sin t}{\pi t}$.

The graph of the function $F_1(t)$ is shown in Fig. 192. It can be
demonstrated (see the solution of Exercise 2, Sec. 3.6) that

$$\int_{-\infty}^{\infty} \frac{\sin t}{t}\,dt = \pi, \quad \text{or} \quad \int_{-\infty}^{\infty} F_1(t)\,dt = \int_{-\infty}^{\infty} \frac{\sin t}{\pi t}\,dt = 1$$

The function $F_1$ has a principal maximum at $t = 0$, $F_1(0) = \dfrac{1}{\pi}$;
to the left and right it decreases in oscillatory fashion (see Fig. 192).

From (8) we see that, up to the constant factor $2\pi$, $I_N(t)$ results
from the function $F_1(t)$ via the very same transformation which,
in Sec. 6.1, led to the concept of the delta function. Thus, passing
to the limit, we get

$$\int_{-\infty}^{\infty} e^{i\omega t}\,d\omega = I(t) = 2\pi\,\delta(t) \tag{9}$$


In this derivation there are two points that require amplification.
First, when defining the delta function in Sec. 6.1 we used as
illustrations only functions that do not assume negative values. However
this condition actually was not made use of so that negative values may
be admitted. Second, and this is more important, the function $F_1(t)$ is
not a rapidly decreasing function. Indeed, the graph of $F_N(t) = N F_1(Nt)$
for large $N$ is shown in Fig. 193. We see that it oscillates frequently
on any finite fixed interval $t_1$, $t_2$ not containing the point $t = 0$,
but the amplitude of oscillations is not small, i.e. the function
$F_N(t)$ is not close to zero, as in the examples of Sec. 6.1. If we
took $\int_{-\infty}^{+\infty} |F_N|\,dt$, then such an integral would diverge. But this is
not essential, for with increasing $N$ the integral of the function $F_N(t)$
behaves in the following manner:

$$\int_{-\infty}^{t} F_N(t)\,dt = \int_{-\infty}^{t} N\,\frac{\sin Nt}{\pi N t}\,dt = \int_{-\infty}^{Nt} \frac{\sin s}{\pi s}\,ds \;\xrightarrow[N \to \infty]{}\; \begin{cases} 0 & (t < 0), \\ 1 & (t > 0) \end{cases}$$

The graph of this integral is shown in Fig. 194. Thus, for large $N$
we have

$$\int_{-\infty}^{t} F_N(t)\,dt \approx e(t) \text{ (see Sec. 6.3), that is, } F_N(t) \approx e'(t) = \delta(t)$$

Fig. 193

Fig. 194
— CO 
Thus, the delta-like nature of the function $F_N(t)$ for large $N$ is due
not to its rapid decrease but to its frequent oscillations, as a result
of which at a finite distance from $t = 0$ the function when integrated
is to all intents and purposes identically zero.

Arguing in exactly the same way, we could show that

$$\int_{-\infty}^{+\infty} Q(t)\,F_N(t - t_0)\,dt \to Q(t_0) \quad \text{as } N \to \infty$$

in accordance with the
properties of the delta function. In such an integral, the major
contribution is made by the principal maximum of $F_N$ at $t = t_0$, where
$F_N = N/\pi$; the "wings", that is, the regions $t < t_0 - \varepsilon$ and $t > t_0 + \varepsilon$,
make slight contributions because the integral of an alternating
function is small.
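This sifting property of the oscillating kernel can be observed numerically. The sketch below is only an illustration under our own assumptions: we take the kernel $F_N(t) = \sin Nt/(\pi t)$, choose the smooth test function $Q(t) = e^{-t^2}$ and the point $t_0 = 0.5$ arbitrarily, and replace the infinite interval by $(-10, 10)$, outside of which $Q$ is negligible. The integral is computed by the midpoint rule.

```python
import math


def F_N(t: float, N: float) -> float:
    """Kernel F_N(t) = N*F_1(N*t) = sin(N*t)/(pi*t); equals N/pi at t = 0."""
    if abs(t) < 1e-12:
        return N / math.pi
    return math.sin(N * t) / (math.pi * t)


def Q(t: float) -> float:
    """Smooth test function (an arbitrary choice for this sketch)."""
    return math.exp(-t * t)


def sift(t0: float, N: float, half_width: float = 10.0, steps: int = 40000) -> float:
    """Midpoint-rule value of the integral of Q(t)*F_N(t - t0) over (-half_width, half_width)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        t = -half_width + (k + 0.5) * h
        total += Q(t) * F_N(t - t0, N)
    return total * h


for N in (1.0, 5.0, 50.0):
    print(N, sift(0.5, N), Q(0.5))
```

For small $N$ the integral differs noticeably from $Q(t_0)$, while for large $N$ it practically coincides with it, although the kernel itself does not become small away from $t = t_0$.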

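With the aid of (9) it is easy to consider the case of spectral
density given by $F(\omega) = A e^{i\omega\tau}$ ($A$ = constant, $\tau$ = constant):
substituting into (4), we get

$$\int_{-\infty}^{\infty} A e^{i\omega\tau} e^{i\omega t}\,d\omega = A \int_{-\infty}^{\infty} e^{i\omega(t + \tau)}\,d\omega = 2\pi A\,\delta(t + \tau)$$

Conversely, it follows from this that to the function $f(t) = B\,\delta(t - \tau)$
corresponds the density $F(\omega) = \dfrac{B}{2\pi}\,e^{-i\omega\tau}$.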

We can now pass to the case of an arbitrary function $f(t)$ in
the integral (4). As we saw in Sec. 6.2, every function $f(t)$ can be
represented in the form of a sum (to be more precise, an integral)
of delta-like functions

$$f(t) = \sum_{\tau} [f(\tau)\,d\tau]\,\delta(t - \tau)$$

By virtue of what has just been demonstrated, to every term
there corresponds a spectral density $\dfrac{f(\tau)\,d\tau}{2\pi}\,e^{-i\omega\tau}$, which means that
to the whole sum, that is, to the function $f(t)$, there corresponds the spectral
density

$$F(\omega) = \sum_{\tau} \frac{f(\tau)\,d\tau}{2\pi}\,e^{-i\omega\tau}$$

If one recalls that actually this is not a sum but an integral and,
besides, if we denote the variable of integration by $t$, then we get

$$F(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t)\,e^{-i\omega t}\,dt \tag{10}$$


The formulas (4) and (10) are the Fourier transformation formulas.
Using them, we can pass from any function $f(t)$, which is finite as
$t \to \pm\infty$ (this is necessary for the integral (10) to be meaningful),
to its spectral density and, conversely, given the spectral density,
we can restore the function. These formulas are remarkably symmetric
to within the constant factor $\dfrac{1}{2\pi}$ and the sign in the exponent.

If $f(t)$ tends to zero as $t \to -\infty$ and as $t \to +\infty$, then the
integral (10) can, in principle, always be computed at least
numerically.
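As an illustration of such a numerical computation, here is a minimal sketch in Python. Everything in it is our own choice rather than part of the text: the test function $f(t) = e^{-t^2/2}$, the midpoint rule, the finite limits $\pm 8$ that replace the infinite ones, and the function names. For this $f$ the spectral density is known in closed form, $F(\omega) = e^{-\omega^2/2}/\sqrt{2\pi}$, so both formula (10) and the restoration of $f$ by formula (4) can be checked.

```python
import cmath
import math


def integrate(g, a: float, b: float, steps: int):
    """Midpoint rule for a (possibly complex-valued) function g on (a, b)."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


def spectral_density(f, omega: float, T: float = 8.0, steps: int = 400):
    """Formula (10), with the infinite limits replaced by -T and T."""
    return integrate(lambda t: f(t) * cmath.exp(-1j * omega * t), -T, T, steps) / (2.0 * math.pi)


def restore(F, t: float, W: float = 8.0, steps: int = 400):
    """Formula (4), with the infinite limits replaced by -W and W."""
    return integrate(lambda w: F(w) * cmath.exp(1j * w * t), -W, W, steps)


def f(t: float) -> float:
    return math.exp(-t * t / 2.0)


F_at_1 = spectral_density(f, 1.0)
f_at_07 = restore(lambda w: spectral_density(f, w), 0.7)
print(F_at_1.real, math.exp(-0.5) / math.sqrt(2.0 * math.pi))
print(f_at_07.real, f(0.7))
```

The finite limits are admissible here only because both $f(t)$ and its spectral density decay very rapidly; for a slowly decaying function larger limits and more nodes would be needed.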

Let us consider the special case of a real function $f(t)$. We write

$$e^{-i\omega t} = \cos \omega t - i \sin \omega t$$

and obtain

$$F(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) \cos \omega t\,dt - \frac{i}{2\pi} \int_{-\infty}^{\infty} f(t) \sin \omega t\,dt$$

Each of the integrals is a real function. Recall how we wrote
$F(\omega) = A(\omega) + iB(\omega)$. Clearly,

$$A(\omega) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} f(t) \cos \omega t\,dt, \qquad B(\omega) = -\frac{1}{2\pi} \int_{-\infty}^{+\infty} f(t) \sin \omega t\,dt \tag{11}$$

From this it is easy to verify the properties (5) and (6) of the density
$F(\omega)$ for the real function $f(t)$.

Comparing (7) and (11), we are convinced that the coefficient
of $\cos \omega t$, that is, $A(\omega)$, is in turn expressed in terms of the integral
of the function $f(t)$ multiplied by $\cos \omega t$; similarly for $B(\omega)$.
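The properties (5) and (6) can also be observed on a concrete real function. In the sketch below (our own example, not from the text) we take $f(t) = e^{-t}$ for $t \ge 0$ and $f(t) = 0$ for $t < 0$, for which formula (10) gives $F(\omega) = \dfrac{1}{2\pi(1 + i\omega)}$, and compute the spectral density by the midpoint rule over $0 < t < 40$; the upper limit and the number of nodes are arbitrary.

```python
import cmath
import math


def f(t: float) -> float:
    """Real test function: exp(-t) for t >= 0 and zero for t < 0."""
    return math.exp(-t) if t >= 0.0 else 0.0


def F(omega: float, T: float = 40.0, steps: int = 40000) -> complex:
    """Formula (10) by the midpoint rule; f vanishes for t < 0, so we integrate over (0, T)."""
    h = T / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * h
        total += f(t) * cmath.exp(-1j * omega * t)
    return total * h / (2.0 * math.pi)


plus, minus = F(2.0), F(-2.0)
A, B = plus.real, plus.imag
print(A, 1.0 / (2.0 * math.pi * 5.0))    # A(omega) = 1 / (2*pi*(1 + omega^2))
print(B, -2.0 / (2.0 * math.pi * 5.0))   # B(omega) = -omega / (2*pi*(1 + omega^2))
print(minus, plus.conjugate())           # property (5)
```

The real part does not change when $\omega$ is replaced by $-\omega$, and the imaginary part changes sign, as required by (6).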

Note two special cases of the integral (10). Let $f(t) = C$ = constant.
From the formula

$$f(t) = C = \int F(\omega)\,e^{i\omega t}\,d\omega$$

it is clear that to obtain $C$ on the left, that is, $Ce^{0} = Ce^{i0t}$, we
have to take $F(\omega) = C\,\delta(\omega)$. By the general rule,

$$\int \delta(\omega)\,\psi(\omega)\,d\omega = \psi(0),$$

$$\int \delta(\omega)\,e^{i\omega t}\,d\omega = e^{i0t} = 1$$
Let us analyze $f(t) = De^{i\omega_0 t}$ in similar fashion. In this case
$F(\omega) = D\,\delta(\omega - \omega_0)$, which can also readily be verified by
substitution into (4). The same results can also be obtained with the
aid of (9):

$$\frac{C}{2\pi} \int_{-\infty}^{+\infty} e^{-i\omega t}\,dt = C\,\delta(-\omega) = C\,\delta(\omega),$$

$$\frac{D}{2\pi} \int_{-\infty}^{+\infty} e^{i\omega_0 t} e^{-i\omega t}\,dt = D\,\delta(\omega - \omega_0)$$

However, such integrals cannot be evaluated directly and one has
to consider the nondecaying function $C$ or $De^{i\omega_0 t}$ as the limit of a
decaying function, for instance, $Ce^{-at^2}$ or $De^{i\omega_0 t - at^2}$ as $a \to 0$. The
result is a rather long and complicated chain of reasoning. It is better
to obtain these formulas from (9), as was done above, and remember
them.

Suppose an undamped oscillator with natural frequency $\omega_0$ is
acted upon by a force $f(t)$ so that the equation of motion is of
the form

$$m\,\frac{d^2 x}{dt^2} = -m\omega_0^2 x + f(t)$$

We assume that $f(t) = 0$ for $t = -\infty$ and the oscillator was at rest
in the equilibrium position, $x = \dfrac{dx}{dt} = 0$. It is easy to construct a
solution of the problem (cf. Sec. 7.5):

$$x = \frac{1}{m\omega_0} \int_{-\infty}^{t} \sin \omega_0(t - \tau)\,f(\tau)\,d\tau$$

This solution is constructed with the aid of the Green's function
of the problem: an impulse, i.e. a force $\delta(t - \tau)$, acting on the
oscillator at rest, imparts a velocity $\dfrac{1}{m}$ and free oscillations set in:

$$x(t > \tau) = \frac{1}{m\omega_0} \sin \omega_0(t - \tau).$$

Suppose the force $f(t)$ acts during a limited period of time, $f = 0$
for $t > T$. How will the oscillator oscillate when the force ceases
to act? We expand

$$\sin \omega_0(t - \tau) = \sin \omega_0 t \cos \omega_0 \tau - \cos \omega_0 t \sin \omega_0 \tau$$

to get

$$x(t) = \frac{1}{m\omega_0} \left[ \sin \omega_0 t \int_{-\infty}^{t} f(\tau) \cos \omega_0 \tau\,d\tau - \cos \omega_0 t \int_{-\infty}^{t} f(\tau) \sin \omega_0 \tau\,d\tau \right]$$

Since $f(\tau) = 0$ for $\tau > T$, we can write $\int_{-\infty}^{+\infty} f(\tau) \cos \omega_0 \tau\,d\tau$ instead of
$\int_{-\infty}^{t} f(\tau) \cos \omega_0 \tau\,d\tau$ when $t > T$; the same goes for the integral with the sine.
Recalling the formulas (11), we get, for oscillations after the force ceases
to act,

$$x(t) = \frac{2\pi}{m\omega_0}\,[A(\omega_0) \sin \omega_0 t + B(\omega_0) \cos \omega_0 t] \quad \text{for } t > T \tag{12}$$

The amplitude of continuous oscillations of an oscillator with natural
frequency $\omega_0$ after the force ceases to act is

$$x_m = \frac{2\pi}{m\omega_0} \sqrt{A^2(\omega_0) + B^2(\omega_0)} = \frac{2\pi}{m\omega_0}\,|F(\omega_0)| \tag{13}$$

Consequently, the amplitude depends only on the spectral density
of the acting force $F(\omega)$ at the resonance frequency, $\omega = \omega_0$. This
also confirms the assertions made in Sec. 14.1.
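Formula (13) can be compared with a direct numerical integration of the equation of motion. The sketch below is our own illustration, with several assumptions: the force is the pulse $f(t) = e^{-t^2/2}$ (it does not vanish exactly, but is negligible for $|t| > 10$), for which formula (10) gives $|F(\omega_0)| = e^{-\omega_0^2/2}/\sqrt{2\pi}$; the equation is integrated by the classical fourth-order Runge-Kutta method from $t = -10$ to $t = 10$, starting from rest; the values $m = 1.5$ and $\omega_0 = 2$ are arbitrary.

```python
import math


def force(t: float) -> float:
    """Pulse-like force; practically zero outside the interval (-10, 10)."""
    return math.exp(-t * t / 2.0)


def residual_amplitude(m: float, w0: float, t_start: float = -10.0,
                       t_end: float = 10.0, steps: int = 20000) -> float:
    """Integrate m*x'' = -m*w0^2*x + f(t) from rest by RK4; return the final amplitude."""
    h = (t_end - t_start) / steps
    x, v, t = 0.0, 0.0, t_start

    def acc(time: float, pos: float) -> float:
        return -w0 * w0 * pos + force(time) / m

    for _ in range(steps):
        k1x, k1v = v, acc(t, x)
        k2x, k2v = v + 0.5 * h * k1v, acc(t + 0.5 * h, x + 0.5 * h * k1x)
        k3x, k3v = v + 0.5 * h * k2v, acc(t + 0.5 * h, x + 0.5 * h * k2x)
        k4x, k4v = v + h * k3v, acc(t + h, x + h * k3x)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += h
    # For free oscillations x = a*sin(w0*t + phase), so a^2 = x^2 + (v/w0)^2.
    return math.hypot(x, v / w0)


def predicted_amplitude(m: float, w0: float) -> float:
    """Formula (13) with |F(w0)| = exp(-w0^2/2)/sqrt(2*pi) for the pulse above."""
    F_abs = math.exp(-w0 * w0 / 2.0) / math.sqrt(2.0 * math.pi)
    return 2.0 * math.pi * F_abs / (m * w0)


print(residual_amplitude(1.5, 2.0), predicted_amplitude(1.5, 2.0))
```

The two numbers should coincide to many digits, which shows that of the whole time history of the force only its spectral density at the frequency $\omega_0$ affects the final amplitude.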

Physicists often regard $|F(\omega)|^2$ and not $F(\omega)$ as the spectral
density. For example, if the force $f(t)$ is the pressure in a sound wave
or an electric field in an electromagnetic wave, then $|F(\omega)|^2\,d\omega$
is a quantity proportional to the energy of the sound or
electromagnetic oscillations over the portion of the spectrum from $\omega$ to $\omega + d\omega$.

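After some simple manipulations, we can rewrite (12) as

$$x(t) = \frac{\pi}{im\omega_0}\,[F(\omega_0)\,e^{i\omega_0 t} - F(-\omega_0)\,e^{-i\omega_0 t}],$$

$$m\,\frac{dx}{dt} = \pi\,[F(\omega_0)\,e^{i\omega_0 t} + F(-\omega_0)\,e^{-i\omega_0 t}] \tag{14}$$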


Now let us try to solve the problem of the oscillations of an oscillator
directly with the aid of the Fourier integral. To do this we write

$$x(t) = \int_{-\infty}^{+\infty} X(\omega)\,e^{i\omega t}\,d\omega \tag{15}$$

and substitute this expression into the oscillation equation. So as
not to deal with the condition of rest at $t = -\infty$, we consider a
damped oscillator. It is clear that the motion of such an oscillator
does not depend on what occurred in the distant past.
Thus we consider the equation

$$m\,\frac{d^2 x}{dt^2} = -m\omega_0^2 x - h\,\frac{dx}{dt} + f(t) \tag{16}$$

Substituting $x$ and $f$ in the form of Fourier integrals, we get

$$\int_{-\infty}^{+\infty} X(\omega)\,(-m\omega^2 + m\omega_0^2 + i\omega h)\,e^{i\omega t}\,d\omega = \int_{-\infty}^{+\infty} F(\omega)\,e^{i\omega t}\,d\omega$$

From this we immediately get

$$X(\omega) = \frac{F(\omega)}{m(\omega_0^2 - \omega^2) + i\omega h} \tag{17}$$


If the damping is slight, then $X(\omega)$ is particularly great for those
values of $\omega$ for which only the small quantity $i\omega h$ remains in the
denominator. This occurs when $\omega_0^2 - \omega^2 = 0$, i.e., when $\omega = \pm\omega_0$.
Hence $X(\omega_0)$ and $X(-\omega_0)$ are particularly great in the Fourier
transform, or image (see Sec. 14.4), of the function $x(t)$. In the limit,
in the case of continuous oscillations, that is, as $h \to 0$, the expression
(14) can be obtained from (15) and (17) by methods of complex
variable theory. We will not go into that any further here.
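The sharpening of the response near $\omega = \pm\omega_0$ as the damping decreases is easy to see from formula (17) itself. A minimal sketch follows; the function name is ours, and the values $m = 1$, $\omega_0 = 1$ and the constant spectral density $F(\omega) = 1$ are chosen only for illustration.

```python
def response(omega: float, m: float, w0: float, h: float, F: complex = 1.0) -> complex:
    """X(omega) from formula (17) for a given value F of the spectral density of the force."""
    return F / (m * (w0 * w0 - omega * omega) + 1j * omega * h)


m, w0 = 1.0, 1.0
for h in (0.5, 0.1, 0.01):
    at_resonance = abs(response(w0, m, w0, h))
    off_resonance = abs(response(0.9 * w0, m, w0, h))
    # At omega = w0 only the term i*omega*h is left, so |X| = |F| / (w0 * h).
    print(h, at_resonance, off_resonance, at_resonance / off_resonance)
```

As $h$ decreases, the value at resonance grows like $1/(\omega_0 h)$, whereas the value at a fixed distance from resonance tends to a finite limit, so the peak becomes both higher and relatively narrower.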

The last remark concerning harmonic analysis is that instruments
ordinarily record only the amplitude of oscillations of various
frequency. By (13) this means that only the modulus (absolute value)*
$|F(\omega)|$ of the Fourier transform of the force being studied is determined.
This holds true for the ear as a recorder of sound vibrations, and
also for the eye and the spectroscope (the spectroscope only
determines the intensity of light of various frequencies).

In this procedure we lose information concerning the phase of
$F(\omega)$. Imagine for instance two functions $F_1(\omega)$ and $F_2(\omega)$ such that
$|F_1(\omega)| = |F_2(\omega)|$ but $F_1$ and $F_2$ themselves are distinct. This means
that $F_2(\omega) = F_1(\omega)\,e^{i\varphi(\omega)}$ where $\varphi$ is any real function. To these two
Fourier transforms there correspond distinct $f_1(t)$ and $f_2(t)$:

$$f_1(t) = \int F_1(\omega)\,e^{i\omega t}\,d\omega, \qquad f_2(t) = \int F_2(\omega)\,e^{i\omega t}\,d\omega$$

But if $f(t)$ is the air pressure in a sound wave, then our ear perceives
the same sounds and is not capable of distinguishing between $f_1(t)$
and $f_2(t)$. In order to distinguish between $f_1$ and $f_2$ the curve $f(t)$ has
to be recorded by means of a high-speed pressure gauge. Only recently,
persistent attempts have been made to find ways of studying the
phase of $F(\omega)$ in the case of light waves.

We  will  come  back  to  the  question  of  phase  in  Sec.  14.8. 


* Recall that for a real force $F(-\omega) = F^*(\omega)$ so that $|F(-\omega)| = |F(\omega)|$; the
sign of $\omega$ is inessential, it suffices to consider $\omega > 0$.
Exercises

1. Find the spectral density of the function $f(t) = e^{-a|t|}$ ($a > 0$).
Going to the limit as $a \to +0$, obtain the spectral density of
the function $f(t) = 1$. For this example, derive formula (1)
of Ch. 1 from formula (4).

2. Obtain in explicit form the solution, bounded as $t \to \pm\infty$, of
equation (16).

14.3  Causality  and  dispersion  relations 

In Sec. 7.5 we saw that the solution of equation (16) is obtained
from the solution $G(t, \tau)$ (the so-called Green's function) of a similar
equation with a special nonhomogeneous term

$$m\,\frac{d^2 x}{dt^2} + h\,\frac{dx}{dt} + m\omega_0^2 x = \delta(t - \tau)$$

via the formula

$$x(t) = \int_{-\infty}^{\infty} G(t, \tau)\,f(\tau)\,d\tau.$$

On the basis of Sec. 6.2, the solution of any linear problem is of a
similar aspect, the role of the input signal being played by $f(t)$, that
of the response of the system at hand to this signal by $x(t)$. Then the
function $G(t, \tau)$ is the response to a unit instantaneous impulse acting
on the system at time $\tau$.

In  Sec.  6.2  we  saw  that  the  result  of  action  on  a linear  system 
is  of  a similar  form  also  in  the  case  of  a space  coordinate  serving  as 
the  independent  variable  (it  is  precisely  this  case  that  was  discussed 
in  detail  there).  It  turns  out  that  if  for  the  independent  variable  we 
take  the  time  with  its  specific  single  directionality,  then  the  corres- 
ponding Green's  function  has  important  properties  of  an  extremely 
general  nature,  which  we  will  now  discuss. 

If the parameters of the linear system at hand (say, an oscillator),
which is acted upon by signals, do not change in the course
of time, then all actions and the corresponding responses admit of
an arbitrary time shift. This means then that $G = G(t - \tau)$ and so

$$x(t) = \int_{-\infty}^{\infty} G(t - \tau)\,f(\tau)\,d\tau \tag{18}$$

We introduce the spectral densities of the functions $x(t)$, $f(t)$ and
$2\pi G(t)$ (the factor $2\pi$ is put here to simplify subsequent formulas):

$$x(t) = \int X(\omega)\,e^{i\omega t}\,d\omega, \qquad f(t) = \int F(\omega)\,e^{i\omega t}\,d\omega, \qquad 2\pi G(t) = \int L(\omega)\,e^{i\omega t}\,d\omega \tag{19}$$

(all integrals are taken from $-\infty$ to $\infty$). Quite naturally, we will
then regard all the participating functions of time as being finite
as $t \to \pm\infty$, for we consider the Fourier representation only for
such functions. (This finiteness requirement was also implicit in the
construction of the solution (17); it was that requirement that separated
the solution we needed of (16) from the two-parameter family
of all its solutions.) Substituting (18) into (19) and changing the order
of integration, we get


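$$\begin{aligned}
\int X(\omega)\,e^{i\omega t}\,d\omega &= \int G(t - \tau)\left[\int F(\omega)\,e^{i\omega\tau}\,d\omega\right] d\tau \\
&= \iint G(t - \tau)\,F(\omega)\,e^{i\omega\tau}\,d\omega\,d\tau \\
&= \int F(\omega)\left[\int G(t - \tau)\,e^{i\omega\tau}\,d\tau\right] d\omega
\end{aligned}$$

Putting $t - \tau = \tau_1$, we get

$$\begin{aligned}
\int F(\omega)\left[\int G(\tau_1)\,e^{i\omega(t - \tau_1)}\,d\tau_1\right] d\omega
&= \int F(\omega)\,e^{i\omega t}\,\frac{1}{2\pi}\left[\int 2\pi G(\tau_1)\,e^{-i\omega\tau_1}\,d\tau_1\right] d\omega \\
&= \int F(\omega)\,L(\omega)\,e^{i\omega t}\,d\omega
\end{aligned}$$

Comparing the last integral with the original one, we conclude that

$$X(\omega) = L(\omega)\,F(\omega)$$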


To summarize, then, the Fourier transforms (images) of the
input and output signals are connected in a very simple way: the
latter is obtained from the former by multiplying by the transfer
function $L(\omega)$. For example, from (17) it is evident that the transfer
function for an elementary linear oscillator is of the form

$$L(\omega) = \frac{1}{m(\omega_0^2 - \omega^2) + ih\omega} \tag{20}$$


Conversely, given the transfer function $L(\omega)$ and computing
the Green's function $G(t)$, we can pass to (18) for transformation of
signals. There arises an interesting question. Let $L(\omega)$ be the transfer
function of the signal converter. Then the principle of causality must
hold true: it states that a response cannot arise before a signal is
delivered. Mathematically, this principle is equivalent to the requirement
that $G(t - \tau) \equiv 0$ for $t < \tau$, i.e., $G(t) = 0$ for $t < 0$. (For this
reason, the upper limit of the integral (18) is, for real systems, not
equal to $\infty$ but to $t$, which is what we stated in Sec. 7.5.) What
requirements must the function $L(\omega)$ satisfy for the principle of causality
to be fulfilled? It is quite natural that such requirements must
be placed in the foundation of the theory even prior to deriving
appropriate equations.

For the sake of simplicity, let us confine ourselves to the case
where $L(\omega)$ is of the form of a ratio of polynomials, $L(\omega) = \dfrac{P(\omega)}{Q(\omega)}$,
with the degree of the denominator exceeding that of the numerator
at least by 2. (Actually, similar investigations can be carried out
for an appreciably broader class of functions $L(\omega)$.) Besides, we
assume that $Q(\omega)$ does not have any real zeros. Then in order to
compute the integral

$$G(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} L(\omega)\,e^{i\omega t}\,d\omega$$

for $t > 0$ we can make use of that very same device that we employed
in evaluating the integral (36) of Ch. 5, the only difference being
that instead of the complex plane $z$ we have to consider the complex
plane $\omega$. Here, as in Fig. 67, the "closing" semicircle is drawn in the
upper half-plane of $\omega$, since for $t > 0$ we have $|e^{i\omega t}| \le 1$ in this
half-plane (whereas in the lower half-plane $|e^{i\omega t}|$ assumes arbitrarily
large values). On the basis of the general theorem (31) of
Ch. 5 on residues we find that for $t > 0$

$$G(t) = i \sum_{\operatorname{Im}\omega_k > 0} \operatorname*{Res}_{\omega = \omega_k}\{L(\omega)\,e^{i\omega t}\} \qquad (t > 0) \tag{21}$$

In the right-hand member the summation is carried out over
all singular points (poles) of the function $L(\omega)$, that is, over the zeros
of the polynomial $Q(\omega)$ lying in the upper half-plane, $\operatorname{Im}\omega > 0$.

The case $t < 0$ is considered in similar fashion. Then in Fig. 67 we
have to use the lower semicircle instead of the upper one; this yields

$$G(t) = -i \sum_{\operatorname{Im}\omega_k < 0} \operatorname*{Res}_{\omega = \omega_k}\{L(\omega)\,e^{i\omega t}\} \qquad (t < 0) \tag{22}$$

From this, since $t$ is arbitrary, we conclude that the condition
for the principle of causality to hold (that is, $G(t) = 0$ for $t < 0$) is
the absence of singular points in the transfer function $L(\omega)$ in the
lower half-plane: $\operatorname{Im}\omega < 0$. It is easy to verify that the transfer
function (20) satisfies this condition.
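This verification can be sketched numerically. The parameter values below are illustrative assumptions of ours, chosen so that the oscillator is underdamped; the closed-form impulse response used for comparison, $e^{-\gamma t}\sin(\Omega t)/(m\Omega)$ with $\gamma = h/2m$ and $\Omega = \sqrt{\omega_0^2 - \gamma^2}$, is the standard one for such an oscillator and is not quoted from the text. For a simple zero $\omega_k$ of the denominator $Q$, the residue of $e^{i\omega t}/Q(\omega)$ is $e^{i\omega_k t}/Q'(\omega_k)$.

```python
import cmath
import math

# Illustrative parameter values (assumptions, not taken from the text).
m, h, w0 = 1.0, 0.4, 2.0

def dQ(w):
    """Derivative of the denominator Q(w) = m*(w0**2 - w**2) + i*h*w of (20)."""
    return -2 * m * w + 1j * h

# Zeros of Q, i.e. roots of m*w**2 - i*h*w - m*w0**2 = 0.
disc = cmath.sqrt(4 * m**2 * w0**2 - h**2)
poles = [(1j * h + disc) / (2 * m), (1j * h - disc) / (2 * m)]

def green(t):
    """G(t) by the residue sums (21) for t > 0 and (22) for t < 0 (simple poles)."""
    upper = [p for p in poles if p.imag > 0]
    lower = [p for p in poles if p.imag < 0]
    if t > 0:
        return 1j * sum(cmath.exp(1j * p * t) / dQ(p) for p in upper)
    return -1j * sum(cmath.exp(1j * p * t) / dQ(p) for p in lower)

def green_closed_form(t):
    """Standard impulse response of the underdamped oscillator; zero for t < 0."""
    gamma = h / (2 * m)
    big_omega = math.sqrt(w0**2 - gamma**2)
    if t <= 0:
        return 0.0
    return math.exp(-gamma * t) * math.sin(big_omega * t) / (m * big_omega)

print(poles)
print(green(1.5), green_closed_form(1.5), green(-1.5))
```

Both poles have imaginary part $h/2m > 0$, so the sum in (22) is empty and $G(t)$ vanishes for $t < 0$, while for $t > 0$ the residue sum reproduces the damped sinusoid.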

The resulting condition can also be interpreted as follows. Since
the influence function $G(t)$ for $t \ne 0$ satisfies a homogeneous equation,
from formulas (21) and (22) it is evident that to every singular point
$\omega_k = \alpha_k + i\beta_k$ of the transfer function there corresponds a term
of the form $e^{i\omega_k t} = e^{-\beta_k t}e^{i\alpha_k t}$ in the solutions of the homogeneous
equation. If all $\omega_k$ lie in the upper half-plane, then all $\beta_k > 0$, and
for this reason all these summands decay as $t \to \infty$. This is as it
should be if the system only converts signals and does not have any
energy sources of its own. Thus, the physical equivalent of the condition
concerning the location of singular points of the transfer
function consists in the internal energy stability of the system,
that is, in the damping of the energy induced in it. (Note that one
could consider linear systems with internal energy sources, that is,
unstable systems; then for the principle of causality to hold we
would have to give up the requirement that the solutions be bounded
as $t \to \infty$.)

Fig. 195

We can indicate yet another important condition tantamount
to the principle of causality. To derive it, consider the integral

$$I = \oint_{(L)} \frac{L(\omega)}{\omega - \omega_0}\,d\omega \tag{23}$$

extended around the contour in Fig. 195, where $\omega_0$ is a fixed real
value of $\omega$. Fulfilment of the principle of causality is equivalent
to the absence of singular points inside $(L)$ and for this reason, due to
the arbitrary nature of $\omega_0$, to the condition $I = 0$. Now let $R \to \infty$
and let the radius of the small semicircle $\varepsilon \to 0$. Then, reasoning as
in Sec. 5.9, we find that the integral around the large circle tends
to zero. The integral over the horizontal segments tends to

$$-P\int_{-\infty}^{\infty} \frac{L(\omega)}{\omega - \omega_0}\,d\omega \tag{24}$$

where $P$ stands for Cauchy's principal value. By definition we have

$$P\int_{-\infty}^{\infty} = \lim_{\varepsilon \to 0}\left(\int_{-\infty}^{\omega_0 - \varepsilon} + \int_{\omega_0 + \varepsilon}^{\infty}\right)$$

(This complication is due to the divergence of the integral (24) for
$\omega = \omega_0$ in the ordinary sense of Sec. 3.1.) Finally, the integral around
the small semicircle tends to

$$L(\omega_0)\int \frac{d\omega}{\omega - \omega_0} = -i\pi L(\omega_0)$$
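When passing to the limit in (23) as $R \to \infty$, $\varepsilon \to 0$, we get

$$-P\int_{-\infty}^{\infty} \frac{L(\omega)}{\omega - \omega_0}\,d\omega - i\pi L(\omega_0) = 0, \quad\text{or}\quad L(\omega_0) = \frac{i}{\pi}\,P\int_{-\infty}^{\infty} \frac{L(\omega)}{\omega - \omega_0}\,d\omega$$

Representing $L(\omega)$ as $\operatorname{Re}L(\omega) + i\operatorname{Im}L(\omega)$ and separating the real
part from the imaginary part in the last equation, we get the so-called
dispersion relations for the transfer function:

$$\operatorname{Re}L(\omega_0) = -\frac{1}{\pi}\,P\int_{-\infty}^{\infty} \frac{\operatorname{Im}L(\omega)}{\omega - \omega_0}\,d\omega, \qquad
\operatorname{Im}L(\omega_0) = \frac{1}{\pi}\,P\int_{-\infty}^{\infty} \frac{\operatorname{Re}L(\omega)}{\omega - \omega_0}\,d\omega$$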

which,  as  we  see,  are  equivalent  to  the  principle  of  causality.  These 
relations  play  an  important  part  in  quantum  mechanics.  They  permit 
obtaining  important  propositions  even  for  systems  for  which  no 
appropriate  theory  has  yet  been  developed  and  so  the  type  of  equa- 
tions is  not  known. 
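As a rough numerical illustration, the dispersion relations, written as $\operatorname{Re}L(\omega_0) = -\frac{1}{\pi}P\int \frac{\operatorname{Im}L(\omega)}{\omega-\omega_0}d\omega$ and $\operatorname{Im}L(\omega_0) = \frac{1}{\pi}P\int \frac{\operatorname{Re}L(\omega)}{\omega-\omega_0}d\omega$, can be checked for the oscillator transfer function (20). The parameter values, the truncation of the infinite range and the symmetric midpoint rule used for the principal value are all assumptions of this sketch, not part of the text.

```python
import math

# Illustrative oscillator parameters (assumptions, not from the text).
m, h, w_nat = 1.0, 0.5, 2.0

def L(w):
    """Transfer function (20) of the damped oscillator."""
    return 1.0 / (m * (w_nat**2 - w**2) + 1j * h * w)

def principal_value(g, w0, half_width=200.0, step=1e-3):
    """Principal value of the integral of g(w)/(w - w0).

    The points w0 + u and w0 - u are paired, which cancels the singular
    part; the infinite range is truncated to |w - w0| < half_width.
    """
    total = 0.0
    n = int(half_width / step)
    for k in range(n):
        u = (k + 0.5) * step  # midpoint of the k-th cell
        total += (g(w0 + u) - g(w0 - u)) / u
    return total * step

w0 = 1.3  # real frequency at which the relations are tested
re_from_im = -principal_value(lambda w: L(w).imag, w0) / math.pi
im_from_re = principal_value(lambda w: L(w).real, w0) / math.pi
print(L(w0).real, re_from_im)
print(L(w0).imag, im_from_re)
```

Each printed pair should agree to several decimal places: the real part of $L$ at one frequency is recovered from the imaginary part at all frequencies, and vice versa.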

The principle of causality is what rules out a device, mentioned
in an article by R. Hagedorn [7], for
taking photographs in complete darkness prior to switching on the
light. Suppose a flash of light occurs in complete darkness. It is
described by a delta function whose spectral representation contains
harmonics of the form $e^{i\omega t}$ that are present at all times. It is further
suggested that a filter be set up to screen all harmonics except one specific
one, with the result that we get monochromatic light that acts not
only after but even prior to the light flash! But this is absurd, for
the impossibility of such filtering follows from the principle of causality.
We can also use dispersion relations to show that a filter that
passes only one frequency is impossible. The law of transmission of
any filter is such that the sum of the harmonics from any flash of
light must be identically equal to zero prior to the flash.

14.4  Properties  of  the  Fourier  transformation 

We will call the function $f(t)$ the original (preimage) and the
corresponding function $F(\omega)$ the image (Fourier transform) of $f(t)$
under the Fourier transformation. The Fourier transformation itself
(otherwise called the Fourier operator, cf. Sec. 6.2) is defined by formula
(10) and we use it to pass from the original to the image. Formula
(4), where we passed from the image to the original, defines
the inverse Fourier transformation.

Let  us  consider  some  properties  of  the  Fourier  transformation. 
First  of  all,  it  is  linear,  which  means  that  when  the  originals  are  added, 
so  also  are  the  images,  when  an  original  is  multiplied  by  a constant, 
the  image  is  multiplied  by  that  constant  too.  This  property  follows 
immediately  from  similar  properties  of  the  integral  and  has  already 
been  used  by  us  in  Sec.  14.2  where  we  made  use  of  the  principle 
of  superposition  in  deriving  (10).  (Physically,  this  follows  from  the 
linearity  of  the  oscillators  under  consideration.)  Naturally,  the  inverse 
transformation  is  also  linear. 

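When the original is shifted by a constant $a$, its image is multiplied
by $e^{-ia\omega}$. True enough, since after a shift we get the function
$f(t - a)$ and its spectral density is

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} f(t - a)\,e^{-i\omega t}\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(\tau)\,e^{-i\omega(\tau + a)}\,d\tau = e^{-ia\omega}\,\frac{1}{2\pi}\int_{-\infty}^{\infty} f(\tau)\,e^{-i\omega\tau}\,d\tau = e^{-ia\omega}F(\omega)$$

(we made the substitution $t - a = \tau$).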

This property is particularly evident in the case of separate harmonics:
if we shift the harmonic $e^{i\omega_0 t}$ by $a$, we get
$e^{i\omega_0(t - a)} = e^{-ia\omega_0}e^{i\omega_0 t}$, that is, it is multiplied by $e^{-ia\omega_0}$ and so its image,
$\delta(\omega - \omega_0)$, is multiplied by $e^{-ia\omega_0}$ or, what is the same thing for
this function, by $e^{-ia\omega}$. And since all these images receive the same
factor, their sum, or $F(\omega)$, has that factor too.
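The shift rule (the image of $f(t - a)$ equals $e^{-ia\omega}F(\omega)$) is easy to test numerically. The sketch below assumes the convention $F(\omega) = \frac{1}{2\pi}\int f(t)e^{-i\omega t}dt$ used in this chapter; the bell-shaped test function, the shift, the frequency and the quadrature settings are illustrative choices of ours.

```python
import cmath
import math

def fourier(f, w, t_max=12.0, n=4800):
    """(1/2pi) * integral of f(t) exp(-i w t) dt, trapezoidal rule on [-t_max, t_max]."""
    dt = 2 * t_max / n
    total = 0j
    for k in range(n + 1):
        t = -t_max + k * dt
        weight = 0.5 if k in (0, n) else 1.0  # end points count with half weight
        total += weight * f(t) * cmath.exp(-1j * w * t)
    return total * dt / (2 * math.pi)

def bell(t):
    """Test original: a rapidly decaying bell-shaped function."""
    return math.exp(-t * t)

a, w = 1.5, 2.0  # illustrative shift and frequency
shifted_image = fourier(lambda t: bell(t - a), w)
predicted = cmath.exp(-1j * a * w) * fourier(bell, w)
print(shifted_image, predicted)
```

The two complex numbers coincide to rounding error: the shift changes only the phase of the image, not its modulus.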

In similar fashion it can be verified that if the image is shifted
by an amount $\alpha$, the original is multiplied by $e^{i\alpha t}$. It is precisely
this property of invariance, to within a multiplying constant, of the
function $e^{i\omega t}$ relative to shifts, together with the possibility of
superposition, that makes this function play such a role in the Fourier
transformation. It is quite natural that when studying processes
that obey linear equations that are homogeneous in time, we have
to rely on functions which themselves admit a time shift. *

* Here the equations themselves need not be differential equations only. A typical
integral equation homogeneous in time is

We will consider in more detail the transformations of a family
of solutions of the problem just posed. The term transformation group
is used for a set of transformations having two properties: together
with any two transformations it contains the transformation resulting
from their successive application, and every transformation has an
inverse. For example, a transformation group is formed by the collection
of all multiplications by nonzero constants (the constant must be nonzero
for the inverse transformation to exist). Under this transformation,
every function $f(t)$ goes into $Cf(t)$. Another important instance
is the group of shifts under which every function $f(t)$ goes into $f(t + \tau)$
(the constant $\tau \gtrless 0$ determines the shift). If a study is being made of
a linear problem that is homogeneous in time, then its general solution
(i.e., the family of all solutions) must be invariant with respect
to both these transformation groups. Proof is given in linear algebra
that this general solution, which, generally, includes many parameters,
may be represented in the form of a sum of one-parameter
families of solutions, these families also being invariant
under the indicated transformation groups (exceptions will be
given below).

However, it is easy to verify that a one-parameter family of
functions possessing such invariance must necessarily be of the form
$Ce^{pt}$ ($C$ is arbitrary and serves as a parameter inside the family, $p$
is fixed and serves as a parameter of the family itself). Indeed,
since the family is a one-parameter family and is invariant under
multiplication by constants, it is exhausted by functions of the
form $Cf(t)$, where $C$ is arbitrary and $f(t)$ is some function of the family.
But from the invariance with respect to shifts it follows that
$f(t + \tau) = C_\tau f(t)$, whence

$$f'(t) = \lim_{\tau \to 0}\frac{f(t + \tau) - f(t)}{\tau} = \lim_{\tau \to 0}\frac{C_\tau f(t) - f(t)}{\tau} = \left[\lim_{\tau \to 0}\frac{C_\tau - 1}{\tau}\right] f(t)$$

Denoting the limit in the square brackets by $p$, we arrive at the differential
equation $f'(t) = pf(t)$, whence $f(t) = e^{pt}$ to within a multiplying
constant.

From this we find that if a family of functions is invariant with
respect to both indicated transformation groups and has, say, two
degrees of freedom, then it is of the form $C_1e^{p_1t} + C_2e^{p_2t}$, where $C_1$
and $C_2$ are arbitrary constants. However, if the problem at hand
contains parameters and for some values of these parameters we
get $p_1 = p_2$, then the form of the two-parameter family of solutions
changes. In that case it may be shown, as in Sec. 7.3,
that the indicated family takes on the form $C_1e^{p_1t} + C_2te^{p_1t}$.
(Here, the one-parameter family $C_2te^{p_1t}$ is not by itself invariant
to shifts.) If three values coincide, $p_1 = p_2 = p_3$, then the three-parameter
family of solutions becomes $C_1e^{p_1t} + C_2te^{p_1t} + C_3t^2e^{p_1t}$, etc.

From the properties that have been demonstrated it follows that
the Fourier transform (image) is multiplied by $i\omega$ when differentiating
the function $f(t)$. Indeed, for the image of the function $\dfrac{f(t + h) - f(t)}{h}$
we have

$$\frac{1}{h}e^{i\omega h}F(\omega) - \frac{1}{h}F(\omega) = \frac{e^{i\omega h} - 1}{h}F(\omega) = \left[i\omega + \frac{(i\omega)^2}{2}h + \dots\right]F(\omega)$$

Passing to the limit as $h \to 0$, we get our assertion. It can be verified
in similar fashion that when differentiating the image, the preimage
is multiplied by $-it$. These properties can be seen directly if we form
the derivative of the integral with respect to the parameter,

$$\frac{df}{dt} = \frac{d}{dt}\int F(\omega)\,e^{i\omega t}\,d\omega = \int i\omega F(\omega)\,e^{i\omega t}\,d\omega$$

and also for the inverse transformation

$$F(\omega) = \frac{1}{2\pi}\int f(t)\,e^{-i\omega t}\,dt, \qquad \frac{dF}{d\omega} = -\frac{i}{2\pi}\int t f(t)\,e^{-i\omega t}\,dt$$
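Both differentiation rules (the image of $f'(t)$ is $i\omega F(\omega)$, and $dF/d\omega$ is the image of $-it\,f(t)$) can be checked with a few lines of code. This is only a sketch under our own assumptions: the test function $e^{-t^2}$, the frequency, the quadrature settings and the central difference used for $dF/d\omega$ are illustrative choices.

```python
import cmath
import math

def fourier(f, w, t_max=12.0, n=4800):
    """(1/2pi) * integral of f(t) exp(-i w t) dt, trapezoidal rule on [-t_max, t_max]."""
    dt = 2 * t_max / n
    total = 0j
    for k in range(n + 1):
        t = -t_max + k * dt
        weight = 0.5 if k in (0, n) else 1.0  # end points count with half weight
        total += weight * f(t) * cmath.exp(-1j * w * t)
    return total * dt / (2 * math.pi)

def bell(t):
    return math.exp(-t * t)

def bell_derivative(t):
    return -2 * t * math.exp(-t * t)

w, eps = 1.5, 1e-4  # test frequency and finite-difference step

# Differentiating the original multiplies the image by i*w.
image_of_derivative = fourier(bell_derivative, w)
iw_times_image = 1j * w * fourier(bell, w)

# Differentiating the image multiplies the original by -i*t.
derivative_of_image = (fourier(bell, w + eps) - fourier(bell, w - eps)) / (2 * eps)
image_of_minus_it_f = fourier(lambda t: -1j * t * bell(t), w)

print(image_of_derivative, iw_times_image)
print(derivative_of_image, image_of_minus_it_f)
```

The first pair agrees to rounding error; the second agrees up to the small error of the central difference.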

The  foregoing  properties  of  the  Fourier  transformation  find 
remarkable  application  to  the  solution  of  linear  differential  equa- 
tions with  constant  coefficients.  We  made  use  of  them  in  Sec.  14.2 
when  we  considered  the  problem  of  forced  oscillations  of  an  oscillator 
with  friction. 

It is interesting to consider the equation of free oscillations, that
is, equation (16) for $f(t) = 0$. Performing the Fourier transformation,
we get

$$(-m\omega^2 + ih\omega + m\omega_0^2)\,X(\omega) = 0 \tag{25}$$

whence $X(\omega) = 0$ and $x(t) = 0$. We only get the trivial (zero) solution.
The solutions found in Sec. 7.3 were unbounded (exponentially large)
as $t \to -\infty$ and therefore they are not obtainable via the Fourier
transformation. That is why, in Sec. 14.2, in solving equation (16)
we only got one solution which did not include arbitrary constants.
Now suppose there is no friction, i.e. $h = 0$. Then in place of (25)
we get

$$(-m\omega^2 + m\omega_0^2)\,X(\omega) = 0 \quad\text{or}\quad (\omega^2 - \omega_0^2)\,X(\omega) = 0 \tag{26}$$

It might appear that from this it also follows that $X(\omega) = 0$. But
actually (26) is satisfied by the function

$$X(\omega) = C_1\delta(\omega - \omega_0) + C_2\delta(\omega + \omega_0) \tag{27}$$

where $C_1$ and $C_2$ are arbitrary constants. Indeed (see Sec. 6.1),

$$\begin{aligned}
(\omega^2 - \omega_0^2)\,[C_1\delta(\omega - \omega_0) + C_2\delta(\omega + \omega_0)]
&= C_1(\omega^2 - \omega_0^2)\,\delta(\omega - \omega_0) + C_2(\omega^2 - \omega_0^2)\,\delta(\omega + \omega_0) \\
&= C_1(\omega_0^2 - \omega_0^2)\,\delta(\omega - \omega_0) + C_2[(-\omega_0)^2 - \omega_0^2]\,\delta(\omega + \omega_0) = 0
\end{aligned}$$




Fig.  196 


From (27), by the inversion formula (4), we get

$$x(t) = \int [C_1\delta(\omega - \omega_0) + C_2\delta(\omega + \omega_0)]\,e^{i\omega t}\,d\omega = C_1e^{i\omega_0 t} + C_2e^{-i\omega_0 t}$$

We arrive at a solution that we know from Sec. 7.3.

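Now let us consider the Fourier transforms (images) of certain
functions mentioned in Sec. 6.3. Here, the Fourier operator will
be denoted by $\mathcal{F}$. Formula (9) signifies that

$$\mathcal{F}[\delta(t)] = \frac{1}{2\pi} \tag{28}$$

It is easy to verify that, conversely, $\mathcal{F}[1] = \delta(\omega)$.

In order to compute $\mathcal{F}[\operatorname{sgn} t]$ (see Sec. 6.3), first replace the
limits of integration in the right member of (10) by $\pm N$, where $N$
is great, to get

$$\frac{1}{2\pi}\int_{-N}^{N} \operatorname{sgn} t \cdot e^{-i\omega t}\,dt = \frac{1}{2\pi}\left[-\int_{-N}^{0} e^{-i\omega t}\,dt + \int_{0}^{N} e^{-i\omega t}\,dt\right] = \frac{1 - \cos\omega N}{\pi i\omega}$$

The graph of this function multiplied by $i$ is shown in Fig. 196;
for large $N$ it oscillates frequently about the graph of $\dfrac{1}{\pi\omega}$. Therefore,
on the basis of the same reasoning as in Sec. 14.2, it is natural to
assume that

$$\mathcal{F}[\operatorname{sgn} t] = \frac{1}{\pi i\omega} \tag{29}$$

that is,

$$\int_{-\infty}^{\infty} \frac{1}{\pi i\omega}\,e^{i\omega t}\,d\omega = \operatorname{sgn} t$$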


True enough, for if we regard the integral in the sense of the principal
value,  then  we  can  see  that  the  last  formula  holds  true,  just  as  in 
Sec.  H.3. 

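From (29) there follows the Fourier transform of the unit
function:

$$\mathcal{F}[e(t)] = \mathcal{F}\left[\tfrac{1}{2} + \tfrac{1}{2}\operatorname{sgn} t\right] = \frac{1}{2}\delta(\omega) + \frac{1}{2\pi i\omega} \tag{30}$$

This formula is in agreement with (28), since it follows from (30)
that

$$\mathcal{F}[\delta(t)] = \mathcal{F}[e'(t)] = i\omega\,\mathcal{F}[e(t)] = \frac{1}{2\pi}$$

(the term containing $\omega\,\delta(\omega)$ vanishes).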


Of the other properties of the Fourier transformation, let us
examine the asymptotic behaviour of the Fourier transform as $\omega \to \pm\infty$.
For the sake of simplicity we assume that the function $f(t)$ is nonzero
only on a finite interval of $t$ or at least tends to zero as $t \to \pm\infty$ together
with its derivatives. Then one should not think that $F(\omega)$
too must necessarily have the same properties as $\omega \to \pm\infty$: the
asymptotic behaviour of $F(\omega)$ depends on something quite different.
From (10) it follows immediately that if $f(t) = \delta(t - t_0)$, then
$F(\omega) = \dfrac{e^{-i\omega t_0}}{2\pi}$, whence $F(\omega)$ as $\omega \to \pm\infty$ remains bounded but
does not tend to zero. It can be shown that this will always happen
if $f(t)$ contains delta terms.

Now suppose $f(t)$ does not have any delta terms, but has so-called
discontinuities of the first kind, that is, points $t_0$ for which the limits
$f(t_0 - 0)$ and $f(t_0 + 0)$ are finite but are not equal. (A typical function
with a discontinuity of the first kind is the unit function shown
in Fig. 72.) Then, as was shown in Sec. 6.3, $f'(t)$ will have delta terms.
But by virtue of the differentiation property established above, the function $f'(t)$ has a spectral density
$i\omega F(\omega)$ and so, by the foregoing, we get that $i\omega F(\omega)$ is bounded as
$\omega \to \pm\infty$ but does not tend to zero. Hence, under these assumptions
$F(\omega)$ tends to zero at the rate of $\dfrac{1}{\omega}$ as $\omega \to \pm\infty$.

If the function itself, $f(t)$, is continuous but its derivative has
discontinuities of the first kind (this means that the graph of $f(t)$
has corners, as in Fig. 76), then in similar fashion we find that $F(\omega)$
is of the order of $\dfrac{1}{\omega^2}$ as $\omega \to \pm\infty$, and so forth. Finally, if derivatives
of $f(t)$ of any order are continuous, then $F(\omega)$, as $\omega \to \pm\infty$,
tends to zero faster than any negative power of $|\omega|$; such, for
example, will be the case for $f(t) = \dfrac{1}{1 + t^2}$ (see formula (40) of Ch. 5,
in the right member of which we have to substitute $|\omega|$ for $\omega$ when $\omega < 0$),
$f(t) = e^{-t^2}$ (see Sec. 14.5), and so on.
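The connection between the smoothness of $f(t)$ and the rate of decay of $F(\omega)$ can be seen in a small numerical experiment. The two test functions below (a rectangular pulse with jumps and a triangular pulse with corners, both on $-1 \le t \le 1$) and the quadrature are assumptions of this sketch. For them a direct calculation gives $F(\omega) = \sin\omega/(\pi\omega)$ and $F(\omega) = (1 - \cos\omega)/(\pi\omega^2)$ respectively, under the convention $F(\omega) = \frac{1}{2\pi}\int f(t)e^{-i\omega t}dt$.

```python
import cmath
import math

def fourier(f, w, a, b, n=20000):
    """(1/2pi) * integral over [a, b] of f(t) exp(-i w t) dt, midpoint rule."""
    dt = (b - a) / n
    total = 0j
    for k in range(n):
        t = a + (k + 0.5) * dt
        total += f(t) * cmath.exp(-1j * w * t)
    return total * dt / (2 * math.pi)

def pulse(t):
    """Equal to 1 on [-1, 1] and 0 outside: jumps at the ends."""
    return 1.0

def triangle(t):
    """1 - |t| on [-1, 1] and 0 outside: continuous, with corners."""
    return 1.0 - abs(t)

# Frequencies with sin(w) = 1 and cos(w) = 0, so that the oscillating
# factors do not hide the envelope.
frequencies = [4.5 * math.pi, 40.5 * math.pi]
scaled_pulse = [w * abs(fourier(pulse, w, -1.0, 1.0)) for w in frequencies]
scaled_triangle = [w**2 * abs(fourier(triangle, w, -1.0, 1.0)) for w in frequencies]
print(scaled_pulse, scaled_triangle)
```

At both frequencies $\omega|F|$ for the pulse and $\omega^2|F|$ for the triangle stay close to $1/\pi$: the first transform decays like $1/\omega$, the second like $1/\omega^2$.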
For large $\omega$, the Fourier integral is an integral of a rapidly oscillating
function. Such integrals were considered in Sec. 3.4, and now
we can better understand why in the asymptotic formulas there the
values of the integrand and its derivatives participated only at the
endpoints of the interval of integration. Indeed, there we considered
integrals of the type

$$\int_a^b f(t)\,e^{i\omega t}\,dt$$

as $\omega \to \pm\infty$. But such an integral can be rewritten as

$$\int_{-\infty}^{\infty} \tilde f(t)\,e^{i\omega t}\,dt$$

where the function $\tilde f(t)$ is obtained from $f(t)$ by continuing $f(t)$ outside
the interval $a \le t \le b$ by identical zero. Thus we obtain (up to the
inessential sign of $\omega$ and the coefficient $\dfrac{1}{2\pi}$) the Fourier transform
(image) of the function $\tilde f(t)$. But this function and its derivatives
have discontinuities at $t = a$ and $t = b$, beyond which $\tilde f(t)$ vanishes identically.
By virtue of what has been said, these singularities are what
determine the coefficients in the asymptotic formulas if the function
$f(t)$ itself is sufficiently smooth. It is also clear that if $f(t)$ and its
derivatives have discontinuities between $a$ and $b$, then these discontinuities
too will have their effect on the asymptotic formulas, as
was mentioned in Sec. 3.4.

On the other hand, the procedure of integration by parts (on which
Sec. 3.4 was based) is useful in the theory of the Fourier transformation
since it permits not only determining the order of the Fourier
transform as $\omega \to \pm\infty$, but also obtaining the appropriate asymptotic
formulas.

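Let us consider, for example, the Fourier transform of the function
$f(t) = \dfrac{1}{1 + |t|}$:

$$F(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{1}{1 + |t|}\,e^{-i\omega t}\,dt$$

Here, $f(t)$ has a singularity (corner) at $t = 0$. Integration by parts
gives

$$F(\omega) = \frac{1}{2\pi}\left\{\left[\frac{1}{1 + |t|}\,\frac{e^{-i\omega t}}{-i\omega}\right]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} \frac{-1}{(1 + |t|)^2}\,\operatorname{sgn} t\;\frac{e^{-i\omega t}}{-i\omega}\,dt\right\}$$

where $\operatorname{sgn} t = \dfrac{d|t|}{dt} = \dfrac{|t|}{t}$.

Hence

$$F(\omega) = -\frac{1}{2\pi i\omega}\int_{-\infty}^{\infty} \frac{\operatorname{sgn} t}{(1 + |t|)^2}\,e^{-i\omega t}\,dt$$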


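Here the integrand already has a jump at $t = 0$ equal to 2 and so a
second integration by parts yields a delta term, which generates the
principal term of the asymptotic formula:

$$F(\omega) = -\frac{1}{2\pi i\omega}\cdot\frac{1}{i\omega}\int_{-\infty}^{\infty}\left[2\delta(t) - \frac{2}{(1 + |t|)^3}\right]e^{-i\omega t}\,dt = \frac{1}{\pi}\left[\frac{1}{\omega^2} - \frac{1}{\omega^2}\int_{-\infty}^{\infty} \frac{e^{-i\omega t}}{(1 + |t|)^3}\,dt\right] \tag{31}$$

Since the last integral tends to zero as $\omega \to \pm\infty$, we get the asymptotic
expression

$$F(\omega) \approx \frac{1}{\pi\omega^2} \tag{32}$$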


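which is more exact than the general assertion that $F(\omega)$ must be
of the order of $\dfrac{1}{\omega^2}$ as $\omega \to \pm\infty$.

Expression (32) may be refined by integrating the last integral in the
right member of (31) by parts twice more. The computation,
which we leave to the reader, yields

$$F(\omega) = \frac{1}{\pi}\left[\frac{1}{\omega^2} - \frac{6}{\omega^4} + \frac{12}{\omega^4}\int_{-\infty}^{\infty} \frac{e^{-i\omega t}}{(1 + |t|)^5}\,dt\right]$$

which leads to the asymptotic expression

$$F(\omega) \approx \frac{1}{\pi}\left[\frac{1}{\omega^2} - \frac{6}{\omega^4}\right]$$

that is more exact than (32). This process can be continued.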
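The gain in accuracy can be judged from a rough numerical experiment. The sketch below rests on the identity $F(\omega) = \frac{1}{\pi\omega^2}\left[1 - 2K(\omega)\right]$ with $K(\omega) = \int_0^\infty \frac{\cos\omega t}{(1 + t)^3}\,dt$, which is what two integrations by parts give for $f(t) = 1/(1 + |t|)$ when the evenness of $f$ is used. The frequency, the truncation point and the step of the trapezoidal rule are our own illustrative choices.

```python
import math

def remainder_integral(w, t_max=60.0, dt=1e-3):
    """Trapezoidal estimate of K(w), the integral of cos(w t)/(1 + t)**3
    over t from 0 to infinity, truncated at t_max."""
    n = int(t_max / dt)
    # End points of the trapezoidal rule enter with half weight.
    total = 0.5 * (1.0 + math.cos(w * t_max) / (1 + t_max) ** 3)
    for k in range(1, n):
        t = k * dt
        total += math.cos(w * t) / (1 + t) ** 3
    return total * dt

w = 20.0
reference = (1 - 2 * remainder_integral(w)) / (math.pi * w**2)
leading = 1 / (math.pi * w**2)                    # one-term asymptotic formula
refined = (1 / w**2 - 6 / w**4) / math.pi         # two-term asymptotic formula
print(reference, leading, refined)
print(abs(reference - leading), abs(reference - refined))
```

At this frequency the two-term expression should lie more than ten times closer to the numerically computed value than the one-term expression.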

Let us take a look at another question. In Sec. 14.5 it was shown
that the function $F(\omega) = \dfrac{1}{2\sqrt{\pi}}\,e^{-\omega^2/4}$ serves as the Fourier transform
of the function $f(t) = e^{-t^2}$. Here, $f(t)$ and its derivatives do not
have any singularities at all and that is why it turned out that $F(\omega)$
tends to zero faster than any negative power of $|\omega|$ as $\omega \to \pm\infty$. At
the same time, by virtue of the foregoing, for an arbitrarily large
fixed $N$ the function

$$F_N(\omega) = \frac{1}{2\pi}\int_{-N}^{N} e^{-t^2}e^{-i\omega t}\,dt$$

that approximates $F(\omega)$ is of the order of $|\omega|^{-1}$ as $\omega \to \pm\infty$.

How do we reconcile these facts?

The point is that for large $N$ the coefficients of the expansion
of the function $F_N(\omega)$ in negative powers of $\omega$ become extremely
small so that terms involving $\omega^{-1}$, $\omega^{-2}$, and so on disappear in the
limit. This expansion is easily obtained by integration by parts and
begins with the terms

$$F_N(\omega) = \frac{1}{\pi}\,e^{-N^2}\,\frac{\sin N\omega}{\omega} - \frac{2}{\pi}\,Ne^{-N^2}\,\frac{\cos N\omega}{\omega^2} + \dots$$

For example, already for $N = 5$ the coefficient of $\omega^{-1}$ is of the
order of $10^{-12}$, which is extremely small. The graphs of the functions
$F(\omega)$ and $F_N(\omega)$ are shown schematically in Fig. 197. The difference
in the nature of the "wings" for large $N$ is of hardly any significance
since the values themselves are very small.

Fig. 197

Exercises 

1. Prove that if $F(\omega)$ is the image of $f(t)$, then $\dfrac{1}{|a|}F\!\left(\dfrac{\omega}{a}\right)$ serves
as the image of the function $f(at)$ ($a$ = constant).

2.  Use  Exercise  1 of  Sec.  14.2  to  find  the  Fourier  transforms  of 

the  functions  “I*— *•!  , if (a  > 0). 

Ml 

14.5  Bell-shaped  transformation  and  the  uncertainty  principle 

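An interesting example of the Fourier transformation is given
by the function

$$f(t) = Ce^{-at^2} \qquad (C,\ a = \text{constant} > 0) \tag{33}$$

(its graph is bell-shaped, see Fig. 187). The formula (10) gives us

$$F(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} Ce^{-at^2}e^{-i\omega t}\,dt = \frac{C}{2\pi}\int_{-\infty}^{\infty} e^{-at^2 - i\omega t}\,dt$$

Transforming the exponent by the formula

$$-at^2 - i\omega t = -a\left(t^2 + \frac{i\omega}{a}t\right) = -a\left(t + \frac{i\omega}{2a}\right)^2 - \frac{\omega^2}{4a}$$

and setting $\sqrt{a}\left(t + \dfrac{i\omega}{2a}\right) = z$, we get

$$F(\omega) = \frac{C}{2\pi}\int_{-\infty}^{\infty} e^{-a\left(t + \frac{i\omega}{2a}\right)^2 - \frac{\omega^2}{4a}}\,dt = \frac{C}{2\pi\sqrt{a}}\,e^{-\frac{\omega^2}{4a}}\int_{(L)} e^{-z^2}\,dz \tag{34}$$

where $(L)$ is a straight line in the complex plane $z$ parallel to the real
axis and passing through the point $z = \dfrac{\omega}{2\sqrt{a}}\,i = yi$.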


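We will now show that the integral

$$\int_{(L)} e^{-z^2}\,dz = \int_{-\infty}^{\infty} e^{-(x + iy)^2}\,dx = I_y \tag{35}$$

actually does not depend on $y$. To do this, differentiate $I_y$ with respect
to the parameter $y$ (see Sec. 3.6) to get

$$\frac{dI_y}{dy} = \int_{-\infty}^{\infty} \frac{\partial}{\partial y}\,e^{-(x + iy)^2}\,dx = -2i\int_{-\infty}^{\infty} e^{-(x + iy)^2}(x + iy)\,dx$$

But the last integral can be evaluated:

$$\frac{dI_y}{dy} = ie^{-(x + iy)^2}\Big|_{x = -\infty}^{\infty} = ie^{-x^2 + y^2 - 2ixy}\Big|_{x = -\infty}^{\infty} = 0$$

(the function obtained after integration vanishes at both limits).
Thus, $I_y = \text{constant} = I_y|_{y=0}$. Setting $y = 0$ in (35), we arrive at
the familiar integral (Sec. 4.7)

$$\int_{(L)} e^{-z^2}\,dz = \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$$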

Substituting the value found into (34), we get the spectral
density of the bell-shaped transformation (33):

$$F(\omega) = \frac{C}{2\sqrt{\pi a}}\,e^{-\frac{\omega^2}{4a}} \tag{36}$$

We see that the relationship of density and $\omega$ is similar, bell-shaped,
but with different coefficients.
An important consequence follows from the formulas (33) and
(36). A bell-shaped function does not completely vanish for any value
of the variable. Still, we can reasonably determine the width of the
bell (cf. Sec. 3.2). For example, if for the width we take that portion
over which the height diminishes $e$ times from the maximum value,
then the width of the bell (33) is equal to $\Delta t = 2/\sqrt{a}$, whereas the width
of the transformed bell (36) is equal to $\Delta\omega = 4\sqrt{a}$. Thus, if we
vary $a$, then one of the bells becomes narrower and the other just as
many times wider so that

$$\Delta t \cdot \Delta\omega = 8 = \text{constant} \tag{37}$$
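Both the transform of the bell and the product of the widths can be confirmed numerically. In the sketch below the constants $C$ and $a$, the test frequency and the quadrature settings are illustrative assumptions; the formulas checked are $F(\omega) = \frac{C}{2\sqrt{\pi a}}e^{-\omega^2/4a}$ for the bell $f(t) = Ce^{-at^2}$ and the widths $\Delta t = 2/\sqrt{a}$, $\Delta\omega = 4\sqrt{a}$ measured at $1/e$ of the maximum.

```python
import cmath
import math

C, a = 1.0, 2.5  # illustrative constants of the bell C*exp(-a*t**2)

def bell(t):
    return C * math.exp(-a * t * t)

def F_numeric(w, t_max=10.0, n=4000):
    """(1/2pi) * integral of bell(t) exp(-i w t) dt by the trapezoidal rule."""
    dt = 2 * t_max / n
    total = 0j
    for k in range(n + 1):
        t = -t_max + k * dt
        weight = 0.5 if k in (0, n) else 1.0  # end points count with half weight
        total += weight * bell(t) * cmath.exp(-1j * w * t)
    return total * dt / (2 * math.pi)

def F_formula(w):
    """Closed-form spectral density of the bell."""
    return C / (2 * math.sqrt(math.pi * a)) * math.exp(-w * w / (4 * a))

# Full widths over which the height falls by a factor e from the maximum.
delta_t = 2 / math.sqrt(a)
delta_w = 4 * math.sqrt(a)

print(abs(F_numeric(1.7) - F_formula(1.7)))
print(bell(delta_t / 2) / bell(0.0), F_formula(delta_w / 2) / F_formula(0.0))
print(delta_t * delta_w)
```

Changing `a` narrows one bell and widens the other, while the printed product stays equal to 8.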

It  can  be  shown  that  an  analogous  rule  holds  for  any  shape  of  the 
functions  being  transformed.  This  result  is  of  fundamental  importance. 
Suppose,  as  in  Sec.  14.2,  we  are  considering  the  action  of  a force  f(t ) 
on  a system  of  oscillators  with  distinct  natural  frequencies.  We  see  that 
the  more  localized  (compressed)  the  outer  force  is  in  time,  the  more 
“spread”  is  its  spectrum ; that  is,  the  greater  the  number  of  oscillators 
with  distinct  frequencies  that  this  force  will  excite  with  roughly  the 
same  amplitude.  Conversely,  by  increasing  the  selectivity,  i.e.,  compres- 
sing the  spectrum,  we  are  forced  to  spread  out  the  external  action  in 
time.  This  impossibility  of  simultaneously  localizing  an  external  action 
in  time  and  enhancing  the  selectivity  of  that  action  is  one  of  the  mani- 
festations of  the  so-called  uncertainty  principle , which  plays  a funda- 
mental role  in  modern  physics. 

From  the  uncertainty  principle  it  follows,  for  one  thing,  that  if 
a certain  system  is  in  oscillation  with  a variable  frequency,  then  it  is 
meaningless  to  speak  of  the  value  of  the  frequency  at  a given  time: 
for  instance,  in  acoustics  one  cannot  speak  of  the  exact  pitch  of  sound 
at  a given  instant  of  time,  and  so  forth.  The  time  interval  over  which 
this  frequency  is  determined  cannot  be  taken  substantially  less  than 
the  oscillation  period ; sound  of  a definite  pitch  cannot  last  too  short 
a time! 

In quantum mechanics, the energy of a particle is connected with
the frequency of the wave function. The wave function of a particle
with a definite energy $E$ is proportional to $e^{-iEt/\hbar}$, where $\hbar = 1.05 \times 10^{-27}$ erg·s
is Planck's constant (Planck introduced the quantity
$h = 6.62 \times 10^{-27}$ erg·s; it is more convenient to write formulas with
$\hbar = h/2\pi$). If a particle is observed during a short time interval $\Delta t$,
then, as follows from the foregoing, the frequency of its wave function,
i.e., the quantity $E/\hbar$, can only be found with a low accuracy. We get
the relation: $\Delta E \cdot \Delta t$ is of the order of $\hbar$. Similarly, the wave function
of a particle with a definite momentum along the $x$-axis, $p = mv_x$,
is proportional to $e^{ipx/\hbar}$. If it is known that the particle resides in a
definite interval between $x$ and $x + \Delta x$, then the wave function of
the particle is nonzero only within this interval. Expanding the wave
function in a Fourier integral, we conclude that the momentum of
the particle is known only with a definite accuracy $\Delta p$; the Fourier
integral will contain large terms involving $e^{ipx/\hbar}$, where $p_1 < p < p_1 + \Delta p$.
From this it follows that $\Delta p \cdot \Delta x$ is of the order of $\hbar$.

Fig. 198

From (33) and (36), on the basis of the properties of the Fourier
transformation described in Sec. 14.4, it follows that

$$\text{if } f(t) = Ce^{-at^2 + i\omega_0 t}, \text{ then } F(\omega) = \frac{C}{2\sqrt{\pi a}}\, e^{-\frac{(\omega - \omega_0)^2}{4a}}$$

The real part of f(t) and the function $F(\omega)$ are shown in Fig. 198. The
width $\Delta t$ of the “wave packet” (which is defined in similar fashion to
the function (33)) and the width $\Delta\omega$ of the corresponding spectrum
are connected by the uncertainty relation (37), irrespective of the
frequency $\omega_0$ of oscillations. It is on this frequency that the shift
of the spectrum along the axis of frequencies depends.

Note the remarkable application of the bell-shaped transformation
to the derivation of the normal law of probability theory (Sec. 13.8).
Let the random variables $\xi_1, \xi_2, \ldots, \xi_n$ be independent but
distributed by the same law of density distribution $\varphi(x)$; for the
sake of simplicity we assume that the mean value $\bar{\xi} = 0$. Consider
the random variable $\eta_k = \frac{1}{2\pi} e^{-i\omega\xi_k}$, where $\omega$ is a
specified real constant. Since $\eta_k$ takes on the values
$\frac{1}{2\pi} e^{-i\omega x}$ with probability density $\varphi(x)$, its mean value is

$$\bar{\eta}_k = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-i\omega x}\,\varphi(x)\,dx = \Phi(\omega)$$

Form the product $\eta = \eta_1\eta_2\cdots\eta_n = (2\pi)^{-n} e^{-i\omega\xi}$,
where $\xi = \xi_1 + \xi_2 + \ldots + \xi_n$, whence

$$(2\pi)^{n-1}\eta = \frac{1}{2\pi}\, e^{-i\omega\xi}$$

That is, by virtue of the preceding section the mean value of
$(2\pi)^{n-1}\eta$ serves as the Fourier transform of the distribution function
$\varphi_n(x)$ of the random variable $\xi$.

In Sec. 13.7 we saw that when independent random variables are
multiplied, so also are their mean values.

Hence,

$$\overline{(2\pi)^{n-1}\eta} = (2\pi)^{n-1}\,\bar{\eta}_1\bar{\eta}_2\cdots\bar{\eta}_n = (2\pi)^{n-1}[\Phi(\omega)]^n$$

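It is easy to verify that the function $\Phi(\omega)$ has a maximum
for $\omega = 0$. Indeed,

$$\Phi(0) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \varphi(x)\,dx = \frac{1}{2\pi}$$

whereas for other $\omega$,

$$|\Phi(\omega)| \le \frac{1}{2\pi}\int_{-\infty}^{\infty} |e^{-i\omega x}|\,\varphi(x)\,dx = \frac{1}{2\pi}\int_{-\infty}^{\infty} \varphi(x)\,dx = \frac{1}{2\pi}$$

But on pages 81-82 we saw that in this case, for large n, the function
$[\Phi(\omega)]^n$ may be replaced approximately by $Ae^{-c\omega^2}$, where
$A = e^{n\psi(0)}$, $c = -\frac{1}{2}\,n\,\psi''(0)$, $\psi(\omega) = \ln\Phi(\omega)$.

In our example we get

$$\Phi(0) = \frac{1}{2\pi}, \quad \Phi'(0) = 0, \quad \Phi''(0) = -\frac{1}{2\pi}\int_{-\infty}^{\infty} x^2\varphi(x)\,dx = -\frac{\Delta^2}{2\pi}$$

($\Delta^2$ is the variance of the variable $\xi$; see Sec. 13.8), whence

$$\psi(0) = \ln\frac{1}{2\pi}, \quad \psi''(0) = \frac{\Phi''(0)\Phi(0) - \Phi'^2(0)}{\Phi^2(0)} = -\Delta^2$$

and so

$$(2\pi)^{n-1}[\Phi(\omega)]^n \approx (2\pi)^{n-1}\, e^{n\ln\frac{1}{2\pi}}\, e^{-\frac{1}{2}n\Delta^2\omega^2} = \frac{1}{2\pi}\, e^{-\frac{1}{2}n\Delta^2\omega^2}$$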



By the foregoing we have obtained the Fourier transform of the
function $\varphi_n(x)$. As we see, it is bell-shaped. By formulas (33) and
(36) (in which $\frac{C}{2\sqrt{\pi a}} = \frac{1}{2\pi}$, $\frac{1}{4a} = \frac{n\Delta^2}{2}$,
whence $a = \frac{1}{2n\Delta^2}$, $C = \frac{1}{\Delta\sqrt{2\pi n}}$)
we get the distribution function of the random variable $\xi$:

$$\varphi_n(x) = \frac{1}{\Delta\sqrt{2\pi n}}\, e^{-x^2/2n\Delta^2}$$

We have arrived at the normal law (equation (25) of Ch. 13).
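The approach to the normal law can be observed directly for a concrete density. The sketch below is only an illustration under stated assumptions: it takes each term to be uniformly distributed on the interval from -1/2 to 1/2 (a choice made here, not in the text), builds the density of the sum by repeated numerical convolution, and compares it with the formula for φ_n(x) just obtained. The names `normal_law` and `max_deviation` are hypothetical.

```python
import numpy as np

def normal_law(x, n, var):
    """phi_n(x) = exp(-x^2 / (2 n var)) / sqrt(2 pi n var); var is Delta^2 of one term."""
    return np.exp(-x**2 / (2 * n * var)) / np.sqrt(2 * np.pi * n * var)

def max_deviation(n, m=101):
    """Largest gap between the density of a sum of n uniform terms and the normal law."""
    x1 = np.linspace(-0.5, 0.5, m)           # grid for a single term
    h = x1[1] - x1[0]
    phi = np.ones(m)
    phi /= phi.sum() * h                     # normalize: Riemann sum of the density is 1
    var = np.sum(x1**2 * phi) * h            # variance of the discretized single term
    dens = phi.copy()
    for _ in range(n - 1):
        # the density of a sum of independent terms is the convolution of densities
        dens = np.convolve(dens, phi) * h
    x = np.linspace(-0.5 * n, 0.5 * n, dens.size)
    return float(np.max(np.abs(dens - normal_law(x, n, var))))

for n in (1, 2, 4, 12):
    print(n, round(max_deviation(n), 4))
```

The deviation shrinks as n grows, in agreement with the statement that the approximation holds for large n.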

Exercises 


1.  Prove  the  uncertainty  principle  for  a function  f(t)  that  is  constant 
and  nonzero  only  on  a finite  interval. 

2.  Establish  a relationship  between  the  uncertainty  principle  and  the 
result  of  Exercise  1 of  Sec.  14.4. 


14.6  Harmonic  analysis  of  a periodic  function 

Suppose an external action f(t) is a periodic function with a
certain period T > 0, that is,

$$f(t + T) \equiv f(t) \qquad (38)$$

Then only harmonics $e^{i\omega t}$ having the same property (38) will take
part in the expansion (4) into harmonics. From this we get

$$e^{i\omega(t+T)} = e^{i\omega t}, \text{ that is, } e^{i\omega T} = 1, \quad i\omega T = 2k\pi i$$

(see Sec. 5.4) and we see that the frequency $\omega$ can take on only the
values

$$\omega = \omega_k = \frac{2k\pi}{T} \quad (k = \ldots, -2, -1, 0, 1, 2, \ldots) \qquad (39)$$

Thus, a periodic function has a discrete spectrum and for this reason
a delta-like spectral density

$$F(\omega) = \sum_{k=-\infty}^{\infty} a_k\,\delta\!\left(\omega - \frac{2k\pi}{T}\right)$$

Substituting into (4), we get

$$f(t) = \sum_{k=-\infty}^{\infty} a_k\, e^{\frac{2k\pi i}{T}t} \qquad (40)$$

This is the so-called Fourier series into which the periodic function
is expanded.

To compute the coefficients of (40), take the mean values of both
sides of (40). By virtue of the periodicity of f(t) the mean value of the
left-hand side is

$$\bar{f} = \frac{1}{T}\int_0^T f(t)\,dt$$

In the right-hand member, the mean value of each harmonic is equal
to zero since for $\omega \ne 0$

$$\frac{1}{2N}\int_{-N}^{N} e^{i\omega t}\,dt = \frac{1}{2N}\,\frac{e^{i\omega t}}{i\omega}\,\bigg|_{t=-N}^{N} = \frac{\sin\omega N}{\omega N} \to 0 \quad (N \to \infty)$$

However, one term on the right of (40) is, for k = 0, simply equal to the
constant $a_0$, and its mean value is of course $a_0$. For this reason, from
(40) we get

$$\bar{f} = \frac{1}{T}\int_0^T f(t)\,dt = a_0$$

To find the coefficient $a_k$, multiply both sides of (40) by $e^{-\frac{2k\pi i}{T}t}$
(for a certain fixed k) and then take the mean values of both sides.
Arguing as in the preceding paragraph, we get

$$a_k = \overline{f(t)\,e^{-\frac{2k\pi i}{T}t}} = \frac{1}{T}\int_0^T f(t)\,e^{-\frac{2k\pi i}{T}t}\,dt \qquad (41)$$
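Formula (41) lends itself to a direct numerical check. The sketch below approximates the mean value over a period by the mean over equally spaced samples; the test function, with known coefficients a_0 = 3, a_1 = a_{-1} = 1/2, a_2 = -2i, a_{-2} = 2i, and the name `fourier_coefficient` are illustrative assumptions.

```python
import numpy as np

def fourier_coefficient(f, T, k, n=4096):
    """a_k = (1/T) * integral over a period of f(t) exp(-2 k pi i t / T) dt,
    approximated by the mean over n equally spaced samples."""
    t = np.arange(n) * T / n
    return np.mean(f(t) * np.exp(-2j * np.pi * k * t / T))

T = 2.0

def f(t):
    # 3 + cos(w1 t) + 4 sin(2 w1 t), with w1 = 2 pi / T
    return 3.0 + np.cos(2 * np.pi * t / T) + 4.0 * np.sin(4 * np.pi * t / T)

for k in range(-2, 3):
    print(k, np.round(fourier_coefficient(f, T, k), 6))
```

For a real f(t) the computed values satisfy a_{-k} = a_k*, which is what makes the real form of the series possible.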


If the function f(t) is real, then we often use a different, real,
form of the Fourier series. For this purpose, in the expansion (40)
we combine the symmetric terms

$$f(t) = a_0 + \sum_{k=1}^{\infty}\left(a_k\,e^{\frac{2k\pi i}{T}t} + a_{-k}\,e^{-\frac{2k\pi i}{T}t}\right)$$
$$= a_0 + \sum_{k=1}^{\infty}\left[(a_k + a_{-k})\cos\frac{2k\pi}{T}t + i(a_k - a_{-k})\sin\frac{2k\pi}{T}t\right]$$
$$= a_0 + \sum_{k=1}^{\infty}\left(A_k\cos\frac{2k\pi}{T}t + B_k\sin\frac{2k\pi}{T}t\right) \qquad (42)$$

Here, the coefficients are, by virtue of (41),

$$a_0 = \frac{1}{T}\int_0^T f(t)\,dt,$$
$$A_k = a_k + a_{-k} = \frac{1}{T}\int_0^T f(t)\left(e^{-\frac{2k\pi i}{T}t} + e^{\frac{2k\pi i}{T}t}\right)dt = \frac{2}{T}\int_0^T f(t)\cos\frac{2k\pi}{T}t\,dt \quad (k = 1, 2, \ldots),$$
$$B_k = i(a_k - a_{-k}) = \frac{i}{T}\int_0^T f(t)\left(e^{-\frac{2k\pi i}{T}t} - e^{\frac{2k\pi i}{T}t}\right)dt = \frac{2}{T}\int_0^T f(t)\sin\frac{2k\pi}{T}t\,dt \quad (k = 1, 2, \ldots) \qquad (43)$$

All quantities in the right members of (42) are real.

Fig. 199

To illustrate, take the series expansion of (42). Consider the “periodic
step” shown in Fig. 199 by the heavy lines. Here $a_0 = \bar{f} = \frac{M}{2}$.
By formulas (43)


$$A_k = \frac{2}{T}\int_0^{T/2} M\cos\frac{2k\pi}{T}t\,dt = \frac{2M}{T}\,\frac{T}{2k\pi}\,\sin\frac{2k\pi}{T}t\,\bigg|_{t=0}^{T/2} = 0,$$

$$B_k = \frac{2}{T}\int_0^{T/2} M\sin\frac{2k\pi}{T}t\,dt = \frac{2M}{T}\,\frac{T}{2k\pi}\left(-\cos\frac{2k\pi}{T}t\right)\bigg|_{t=0}^{T/2} = \frac{M}{k\pi}(1 - \cos k\pi)$$

From this,

$$B_1 = \frac{2M}{\pi}, \quad B_2 = 0, \quad B_3 = \frac{2M}{3\pi}, \quad B_4 = 0, \quad B_5 = \frac{2M}{5\pi}, \ldots$$

and we get the expansion

$$f(t) = \frac{M}{2} + \frac{2M}{\pi}\left(\sin\frac{2\pi t}{T} + \frac{1}{3}\sin\frac{6\pi t}{T} + \frac{1}{5}\sin\frac{10\pi t}{T} + \ldots\right)$$

The second partial sum $S_2(t)$ and the third partial sum $S_3(t)$ of this series
are shown in Fig. 199 by the dashed lines (for the sake of simplicity
they are shown only over a period; they continue periodically in both
directions). It is interesting to note that at the points of discontinuity
the sum of the series is equal to the arithmetic mean (the half-sum,
that is) of the limiting values of the function f(t) on the left and on the
right.*
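The behaviour of the partial sums can be examined numerically. The sketch below assumes that the step equals M on the first half of the period and 0 on the second half, which is consistent with the coefficients found above; the function name and the convention of counting sine terms by `n_terms` are illustrative and need not coincide with the numbering of the partial sums in Fig. 199.

```python
import numpy as np

def step_partial_sum(t, n_terms, M=1.0, T=1.0):
    """M/2 + (2M/pi) * sum of sin(2 k pi t / T) / k over the first n_terms odd k."""
    t = np.asarray(t, dtype=float)
    s = np.full_like(t, M / 2)
    for k in range(1, 2 * n_terms, 2):       # k = 1, 3, 5, ...
        s += (2 * M / np.pi) * np.sin(2 * k * np.pi * t / T) / k
    return s

t = np.linspace(0.0, 0.5, 20001)             # first half-period, where f(t) = M = 1
s50 = step_partial_sum(t, 50)

print("value at the discontinuity t = 0:", s50[0])
print("value in the middle of the step :", step_partial_sum(0.25, 50))
print("largest value near the jump     :", s50.max())
```

At the discontinuity every partial sum equals M/2, the half-sum of the limiting values, while close to the jump the partial sums rise above M by a margin that does not die out as more terms are added.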

Since to the oscillation period T there corresponds the frequency
$\frac{2\pi}{T} = \omega_1$ (see (39)), we see from (39) and (40) (or (42)) that the
general nonharmonic oscillation with frequency $\omega = \omega_1$ results from the
combination of harmonic oscillations with frequencies $\omega_1$, $\omega_2 = 2\omega_1$,
$\omega_3 = 3\omega_1$, and so on. The first frequency, $\omega_1$, which is the frequency
of the oscillation itself, is called the fundamental frequency and the
corresponding harmonic is also termed the fundamental harmonic; the
other frequencies and harmonics are called higher.

* Note that the partial sums $S_n(t)$ of the function f(t) (Fig. 199) appreciably
overshoot the step boundaries. This common property of $S_n(t)$ is termed the
Gibbs phenomenon.

These  results  are  very  important,  for  example,  in  acoustics.  We 
are  familiar  with  the  fact  that  the  pitch  of  a sound  corresponds  to 
the  frequency  of  vibrations  of  the  air:  for  example,  the  sound  “la” 
(or  A)  of  the  first  octave  has  a frequency  of  440  hertz  (440  vibrations 
per  second).  But  we  also  know  that  one  and  the  same  note  sounded 
on  different  instruments  exhibits  different  timbre  (distinct  tones). 
What  does  the  timbre  depend  on  ? It  turns  out  that  the  timbre  is  due 
to  the  higher  harmonics  superimposed  on  the  fundamental  harmonic. 
Each  of  them  enters  into  the  oscillation  with  a specific  amplitude,  and 
if  we  vary  proportions  between  the  amplitudes,  this  will  change  the 
timbre. 

Another corollary of the results obtained is this. We know that
if an oscillator with a natural frequency of $\omega_0$ is excited by an external
force with frequency $\omega$, then for $\omega = \omega_0$ we have resonance.

But in practice we know that resonance sets in when $\omega = \frac{\omega_0}{2}$,
$\frac{\omega_0}{3}$, $\frac{\omega_0}{4}$ and so on. The important thing here is that the
exciting force should be periodic but not harmonic! A force
$f = \cos\frac{\omega_0 t}{2}$ does not excite an oscillator with natural frequency
$\omega_0$. Let us take another typical example. Suppose an oscillator with
frequency $\omega_0$ is excited by separate impulses: the best way is to provide
one impulse per oscillation. The best time is when $x = 0$, $\frac{dx}{dt} > 0$,
$\frac{dx}{dt}$ is maximal; the work of the force will then also be a maximum
(we assume that the force acts in the direction of increasing x). Clearly
if the force acts once each period, then the frequency of the force
coincides with the frequency $\omega_0$ of the oscillations.

Now change the action of the force. Suppose the force acts every
other time, say every odd oscillation. It is clear that the oscillations
will build up indefinitely this way too and we will have resonance,
although $\omega = \omega_0/2$. The crux of the matter here is that the force
$f = \sum\delta(t - nT)$ (n an integer) consisting of separate impulses
determined by the interval T is particularly rich in overtones. In
the first case $T = \frac{2\pi}{\omega_0}$, in the second case of infrequent impulses
$T = 2\cdot\frac{2\pi}{\omega_0} = \frac{2\pi}{\omega}$, the frequency of the force $\omega = \omega_0/2$,
but in the second case as well we have a term involving
$e^{i\omega_0 t} = e^{i2\omega t}$ in the expansion of f(t). This is due to the presence
of higher frequencies. Now if a force is acting with frequency
$\frac{\omega_0}{3}$, then it consists of a mixture of harmonics with frequencies
$\frac{\omega_0}{3}$, $2\frac{\omega_0}{3}$, $3\frac{\omega_0}{3}$, $4\frac{\omega_0}{3}$, and so on. Of
these, it is precisely the third with frequency $3\frac{\omega_0}{3} = \omega_0$ that
produces resonance.

On the contrary, a periodic external force with frequency $2\omega_0$,
$3\omega_0$, ... does not produce resonance but only generates higher harmonics
in the oscillator under consideration.
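The role of infrequent impulses can be illustrated with a frictionless oscillator x'' + ω₀²x = Σδ(t − nT) that starts at rest. The sketch below is a minimal model under assumptions made here: unit impulses, no damping, and exact free motion between impulses. The name `amplitude_after_kicks` and the parameter `period_ratio` (the interval between impulses measured in natural periods) are hypothetical.

```python
import numpy as np

def amplitude_after_kicks(period_ratio, n_kicks, w0=1.0):
    """Amplitude of x'' + w0^2 x = sum of unit impulses, the impulses being
    spaced period_ratio natural periods apart; the oscillator starts at rest."""
    tau = period_ratio * 2 * np.pi / w0       # interval between impulses
    c, s = np.cos(w0 * tau), np.sin(w0 * tau)
    x, v = 0.0, 0.0
    for _ in range(n_kicks):
        v += 1.0                              # a unit impulse changes the velocity by 1
        # free harmonic motion until the next impulse
        x, v = x * c + v * s / w0, -x * w0 * s + v * c
    return np.hypot(x, v / w0)

for ratio in (1, 2, 3, 0.5):
    print(f"impulse every {ratio} natural periods: "
          f"amplitude after 100 impulses = {amplitude_after_kicks(ratio, 100):.3f}")
```

Impulses arriving once per period, once per two periods or once per three periods all make the amplitude grow in proportion to the number of impulses, whereas impulses arriving twice per period (force frequency 2ω₀) leave the amplitude bounded.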

Exercise 

Expand in a Fourier series: (a) the function $A\,|\sin\alpha t|$; (b) a
periodic function with period T and equal to ht for 0 < t < T;
(c) $\sum_{n=-\infty}^{\infty} A\,\delta(t - 2nT)$.

14.7  Hilbert  space 

The expansions given in Sec. 14.6 have a marvellous geometrical
interpretation. We have already pointed out (in Sec. 9.6) that any
collection of entities in which linear operations can be performed
(these operations are: addition of the entities and multiplication by
scalars) may be interpreted as a multidimensional vector space. Now
such operations can be performed on functions f(t) considered over a
certain interval of the t-axis (this interval may be finite or infinite and,
in particular, it may coincide with the entire axis, but it must be the
same for all functions considered)! What this means is that these
functions may be regarded as vectors: they form a vector space.
However, characteristic of a function space is the fact that it is
infinite-dimensional. This is because when choosing an arbitrary function f
there is an infinity of degrees of freedom: if, say, we indicate
any finite number of values $f(t_k)$, then this function can still be chosen
in many ways for the remaining values of t.

Let us first assume that the functions at hand take on real values.
We can introduce into a function space the scalar product

$$(f, \varphi) = \int_a^b f(t)\,\varphi(t)\,dt \qquad (44)$$

where a and b are the endpoints of the interval of the t-axis over which
the functions are considered. The natural character of this formula
is a consequence of the following reasoning. Choose n values of the
independent variable

$$t = t_1, t_2, \ldots, t_n$$

and consider the values of any function f solely at these points: $f(t_1)$,
$f(t_2)$, ..., $f(t_n)$. We will then have n degrees of freedom, that is, these
functions form an n-dimensional vector space, in which (by Sec. 9.6)
a scalar product can be introduced via the formula

$$(f, \varphi) = \sum_{k=1}^{n} f(t_k)\,\varphi(t_k) \qquad (45)$$

As the sample values $t_k$ become denser, the n-dimensional space
becomes an infinite-dimensional space, and the sum (45) naturally turns
into the integral (44).
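The passage from the sum (45) to the integral (44) can be followed numerically. In the sketch below the sum over the sample points is multiplied by the spacing between them, which is an assumption added here so that the values settle to a finite limit as the points become denser; the functions t and t² on the interval from 0 to 1, whose scalar product (44) is 1/4, are illustrative choices.

```python
import numpy as np

def sampled_scalar_product(f, g, a, b, n):
    """Sum (45) over n equally spaced sample points, multiplied by the spacing (b - a)/n."""
    t = a + (np.arange(n) + 0.5) * (b - a) / n   # midpoints of n equal subintervals
    return np.sum(f(t) * g(t)) * (b - a) / n

f = lambda t: t
g = lambda t: t**2

for n in (10, 100, 1000):
    print(n, sampled_scalar_product(f, g, 0.0, 1.0, n))
```

The printed values approach 1/4 as the number of sample points grows.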

By formula (44) and according to the rules of vector algebra we
can introduce the “modulus of the vector f”, which is called the norm
of the function f and is denoted by $\|f\|$:

$$\|f\|^2 = (f, f) = \int_a^b [f(t)]^2\,dt \qquad (46)$$

The  space  of  functions  with  scalar  product  (44)  and,  hence,  with  norm 
(46)  is  called  a Hilbert  space  (more  exactly,  a real  Hilbert  space  of 
functions  on  a specified  interval  with  endpoints  a , b).  It  includes  con- 
tinuous, discontinuous,  and  even  unbounded  functions.  But  since 
every  vector  must  have  a finite  modulus,  it  follows  that  the  space 
includes  only  those  functions  for  which  the  integral  (46)  is  either  a 
proper  integral  or  an  improper  convergent  integral  (see  Sec.  3.1), 
that  is  to  say,  such  as  of  necessity  has  a finite  value.  In  particular,  the 
delta  function  is  not  an  element  of  Hilbert  space  because  the  integral 
of  its  square  is  equal  to  infinity  (why?).  In  what  follows  we  will  confine 
ourselves  to  functions  with  a finite  norm. 

The idea of orthogonality is naturally introduced into Hilbert space:
two functions $g_1(t)$ and $g_2(t)$ are orthogonal to each other (over an
interval with endpoints a, b) if their scalar product is equal to zero,
i.e. if

$$\int_a^b g_1(t)\,g_2(t)\,dt = 0 \qquad (47)$$

Two orthogonal functions are similar to two perpendicular vectors.
If there is an orthogonal system of functions, that is to say, a collection
of pairwise orthogonal functions

$$g_1(t),\ g_2(t),\ \ldots,\ g_n(t),\ \ldots \qquad (48)$$

then the problem often arises of expanding any given function f(t)
in terms of these functions, that is, an expansion of f(t) in a series like

$$f(t) = \sum_{k=1}^{\infty} a_k\,g_k(t) \qquad (49)$$

In Ch. 9 we considered the problem of resolving a vector in ordinary
(three-dimensional) space into orthogonal (i.e. perpendicular)
vectors. If there are three such orthogonal vectors, then any vector can
be resolved in terms of them: such a triad of vectors is said to be
complete and we can take it for the basis of that space. Now if there are
two orthogonal vectors, then only vectors lying in the plane of these
two vectors can be resolved in terms of the two vectors. In
three-dimensional space, such a pair of vectors is not complete; it becomes
complete only after adjoining a third vector.

Similarly, the orthogonal system of functions (48) is said to be
complete if any function f(t) can be expanded in a series of the form
(49) in terms of that system; any such system of functions forms a
basis in Hilbert space. If the system (48) is incomplete, then not all
functions can be resolved in terms of the system, but only those
functions that satisfy definite relations. It turns out that any incomplete
orthogonal system of functions can be extended to form a complete
orthogonal system by adjoining to the original system a certain
number (which may even be infinite!) of functions.

In  a finite-dimensional  vector  space  it  is  very  easy  to  determine 
the  completeness  or  incompleteness  of  a system  of  orthogonal  vec- 
tors : if  the  number  of  vectors  in  the  system  is  equal  to  the  dimension- 
ality of  the  space,  then  the  system  is  complete,  but  if  the  number  is 
less  than  the  dimensionality  of  the  space,  then  the  system  is  incomplete. 
In  contrast,  in  an  infinite-dimensional  space,  even  an  incomplete 
system  may  contain  an  infinitude  of  vectors  so  that  the  complete- 
ness of  the  system  cannot  be  determined  by  that  number  alone. 
Ordinarily,  it  is  not  at  all  simple  to  establish  the  completeness  of  the 
system  (48). 

But if the completeness of the system of functions (48) has been
established in some way, then it is very easy to find the coefficients of
the expansion of the given function f(t) in the series (49). To do this,
form the scalar product of both sides of

$$f(t) = a_1g_1(t) + a_2g_2(t) + a_3g_3(t) + \ldots \qquad (50)$$

by one of the functions $g_k(t)$. Then by virtue of the orthogonality
relation, all terms in the right member of (50) vanish except one, in
which the function is multiplied into itself. We then get the equality
$(f, g_k) = a_k(g_k, g_k)$, or

$$a_k = \frac{(f, g_k)}{(g_k, g_k)} = \frac{\int_a^b f(t)\,g_k(t)\,dt}{\int_a^b [g_k(t)]^2\,dt} \qquad (51)$$

(cf. formula (28) of Ch. 9).

An important example of a complete orthogonal system of functions
on a finite interval $0 \le t \le T$ is given by

$$1,\ \cos\frac{2\pi}{T}t,\ \sin\frac{2\pi}{T}t,\ \cos\frac{4\pi}{T}t,\ \sin\frac{4\pi}{T}t,\ \cos\frac{6\pi}{T}t,\ \sin\frac{6\pi}{T}t,\ \ldots \qquad (52)$$

The orthogonality of this system can be obtained by direct computation
of integrals of the type (47) (see Exercise 1). The completeness
of this system was actually demonstrated in Sec. 14.6, since the expansion
in terms of the system (52) is nothing but the expansion (42), and
we have seen that it is possible for any function f(t) (since a periodic
function with period T can assume absolutely arbitrary values for
0 < t < T). If we compute the coefficients of the expansion in terms
of the system (52) with the aid of the general formula (51) and the
easy-to-verify equations

$$\int_0^T 1^2\,dt = T, \quad \int_0^T \cos^2\frac{2k\pi}{T}t\,dt = \frac{T}{2}, \quad \int_0^T \sin^2\frac{2k\pi}{T}t\,dt = \frac{T}{2} \quad (k = 1, 2, \ldots)$$

we get the formulas (43).
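Both the orthogonality of the system (52) and the use of formula (51) can be checked on a grid. The sketch below approximates the scalar product (44) by the midpoint rule and applies (51) to the periodic step of Sec. 14.6, taken here to equal M on the first half-period and 0 on the second; the values of T, M and N and the helper `dot` are illustrative assumptions.

```python
import numpy as np

T, M, N = 2.0, 1.0, 20000
t = (np.arange(N) + 0.5) * T / N              # midpoints on the interval from 0 to T
dt = T / N

def dot(u, v):
    """Scalar product (44) on the interval from 0 to T, by the midpoint rule."""
    return np.sum(u * v) * dt

# First members of the system (52): 1, cos, sin, cos, sin, cos, sin
system = [np.ones(N)]
for k in (1, 2, 3):
    system += [np.cos(2 * k * np.pi * t / T), np.sin(2 * k * np.pi * t / T)]

# Table of all pairwise scalar products; it should be diagonal
gram = np.array([[dot(u, v) for v in system] for u in system])

# Coefficients (51) of the periodic step
step = np.where(t < T / 2, M, 0.0)
coeffs = [dot(step, g) / dot(g, g) for g in system]

print(np.round(gram, 6))
print(np.round(coeffs, 6))
```

The table of scalar products comes out diagonal with entries T, T/2, T/2, ..., and the coefficients reproduce a_0 = M/2, A_k = 0, B_1 = 2M/π, B_2 = 0, B_3 = 2M/3π.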

An interesting generalization of the Pythagorean theorem to
Hilbert space results if we take the scalar product of the left and also
the right member of (50) into itself. Then in the right-hand member all
the pairwise scalar products are equal to zero by virtue of the
orthogonality relation, leaving only the scalar squares of all terms,
and we get

$$(f, f) = a_1^2(g_1, g_1) + a_2^2(g_2, g_2) + \ldots$$

or

$$\|f\|^2 = a_1^2\|g_1\|^2 + a_2^2\|g_2\|^2 + a_3^2\|g_3\|^2 + \ldots \qquad (53)$$

On the left we have the square of the modulus of the vector f, on the
right, the sum of the squares of its projections on the basis vectors
$g_1, g_2, \ldots$
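Relation (53) can be tested on the periodic step of Sec. 14.6. Assuming the step equals M on the first half-period and 0 on the second, the left side of (53) is M²T/2, while the right side is built from a_0 = M/2, the coefficients B_k = 2M/kπ for odd k, and the norms T and T/2 of the functions of the system (52). The function name below is hypothetical.

```python
import math

def pythagoras_right_side(n_terms, M=1.0, T=1.0):
    """a0^2 * ||1||^2 plus the sum of B_k^2 * ||sin||^2 over the first n_terms odd k."""
    total = (M / 2) ** 2 * T                  # ||1||^2 = T
    for k in range(1, 2 * n_terms, 2):
        B_k = 2 * M / (k * math.pi)
        total += B_k ** 2 * T / 2             # ||sin(2 k pi t / T)||^2 = T/2
    return total

left_side = 1.0 ** 2 * 1.0 / 2                # ||f||^2 = M^2 T / 2 for M = T = 1
for n_terms in (1, 10, 1000):
    print(n_terms, pythagoras_right_side(n_terms), left_side)
```

The right side increases toward M²T/2 as more terms are included: none of the "projections" is lost once the whole system is taken into account.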

If we consider a complex Hilbert space, which means the functions
of a real argument can assume complex values, then the formula for
the scalar product will, instead of (44), have the form

$$(f, \varphi) = \int_a^b f(t)\,[\varphi(t)]^*\,dt$$

where the asterisk denotes a complex conjugate quantity (see Secs. 5.2
and 9.6). For this reason the formula for the norm will also change:

$$\|f\|^2 = (f, f) = \int_a^b f(t)\,[f(t)]^*\,dt = \int_a^b |f(t)|^2\,dt$$

Similar changes will occur in other formulas as well. An important
instance of a complete orthogonal system of functions in complex
Hilbert space over the interval $0 \le t \le T$ is the set of functions

$$\ldots,\ e^{-\frac{4\pi i}{T}t},\ e^{-\frac{2\pi i}{T}t},\ 1,\ e^{\frac{2\pi i}{T}t},\ e^{\frac{4\pi i}{T}t},\ \ldots \qquad (54)$$

The expansion in terms of this system of functions is precisely the
expansion (40) of Sec. 14.6.

Such  a geometric  approach  to  the  set  of  functions  enables  us  to 
obtain  many  important  and  interesting  consequences  that  lie  beyond 
the  scope  of  this  book. 

We  conclude  this  section  with  a few  words  on  the  development  of 
the  concept  of  function.  In  the  18th  century,  when  the  concept  first 
appeared,  a function  was  conceived  as  of  necessity  being  specified  by 
a formula.  For  this  reason,  the  case  of  a Fourier  expansion  of  a function 
that  is  specified  by  distinct  formulas  over  different  ranges  of  the  ar- 
gument (see  Sec.  14.6)  was  a puzzle  to  mathematicians  for  some  time: 
was  this  one  function  (it  was  a single  series  and  it  was  formed  via  a very 
definite  law)  or  was  it  several?  An  analysis  of  similar  instances  led, 
in  the  19th  century,  to  the  presently  accepted  definition  of  a function 
as  an  arbitrary  law  of  correspondence  between  dependent  and  inde- 
pendent variables.  This  approach  turned  out  to  be  useful  for  the  logi- 
cal substantiation  of  mathematics  as  a whole. 

However,  from  the  viewpoint  of  applications  this  definition  is 
too  amorphous  and  hazy.  Such  functions  had  the  right  to  exist  as, 
say,  the  Dirichlet  function  that  is  equal  to  0 for  irrational  values  and 
to  1 for  rational  values  of  the  independent  variable  (try  to  imagine 
the  graph  of  such  a function !)  and  other  similar  functions  that  appear 
to  have  meaning  only  in  a formally  logical  sense.  In  applications, 
a function  is  called  upon  to  constitute  a working  organism  and  not 
a disorganized  hodgepodge  of  values.  Today  we  have  a particularly 
clear  view  of  the  role  of  functions  specified  by  formulas,  to  be  more 
exact,  analytic  functions  (Ch.  5). 

But  the  logical  analysis  of  the  function  concept  that  was  carried 
out  in  the  19th  century  was  also  fertile  as  far  as  applications  were 
concerned.  Thus,  functions  specified  by  several  formulas  (piecewise 
analytic  functions)  are  frequently  encountered  in  applications  and  no 
longer  cause  discomfort  (the  most  elementary  case  being  the  unit 
function  e(x)  in  Sec.  6.3).  Their  position  was  still  more  clarified  after 
the  discovery,  in  the  20th  century,  of  generalized  functions:  the 
Dirac  delta  function  and  associated  functions  (a  mathematically  ri- 
gorous theory  of  generalized  functions  was  developed  in  1936  by  the 
Soviet mathematician S.L. Sobolev). For instance, the function f(x)
that is equal to $f_1(x)$ for x < a and to $f_2(x)$ for x > a may be written
down as a single formula:

$$f(x) = f_1(x)\,e(a - x) + f_2(x)\,e(x - a)$$

Apparently, in applications today only the following functions have
an individual (“personal”) value: analytic, piecewise analytic, and
simple generalized functions. (In the theory of random processes of
the Brownian motion type, an important role is played by continuous
nowhere differentiable, and hence nonanalytic, functions that
describe particle trajectories; but these functions cannot be reproduced
and so only have a probabilistic value and not an individual value.)

The role played by singularities of a function (in particular those
that arise when a function is patched together with different formulas)
comes to the fore when passing to complex values of the independent
variable. Consider, for example, the function

$$f(x) = x^2 e(x) = \begin{cases} 0 & (-\infty < x < 0), \\ x^2 & (0 < x < \infty) \end{cases}$$

shown in Fig. 200. The two portions of f(x) have been patched
together at x = 0 with the continuity of the derivative observed. This
function can be approximately represented in the form

$$f(x) = \frac{x^2}{1 + e^{-\alpha x}}$$

where $\alpha$ is very great. But if we allow for the independent variable
assuming complex values, then the right-hand side will have poles
at $1 + e^{-\alpha x} = 0$, or

$$x = \frac{\pi}{\alpha}(2k + 1)\,i \quad (k = 0, \pm 1, \pm 2, \ldots)$$

As $\alpha \to \infty$, these poles fill the entire imaginary axis, which separates
the ranges of the two formulas for the function.

As  we  have  seen  in  Sec.  14.4,  “patching”  is  also  seen  under  har- 
monic analysis  of  a function  in  the  asymptotic  behaviour  of  its  Fourier 
transform. 

Characteristic of the 20th century is yet another approach to the
notion of function, namely as an element of a function space, for
example, of Hilbert space, that is, as a member of a functional
assemblage. Such an approach has certain diverse theoretical and applied
advantages in many problems, but they lie outside the scope of this
text.

Exercises 

1.  Prove that the system of functions (52) is orthogonal on the
interval $0 \le t \le T$, that the same system is nonorthogonal on the
interval $0 \le t \le \frac{T}{2}$; that the same system on the interval
$0 \le t \le 2T$ is orthogonal but incomplete.

2.  Prove that the system of functions (54) is orthogonal on the
interval $0 \le t \le T$.

14.8  Modulus  and  phase  of  spectral  density 

In  Sec.  14.2  we  pointed  out  that  the  simplest  receptors  of  oscilla- 
tions (the  ear,  the  eye,  a photographic  plate)  record  only  the  absolute 
amplitude ; the  readings  of  these  receptors  do  not  depend  on  the  phase 
of  the  oscillations.  This  approach  to  oscillations  is  characteristic  of  the 
19th  century  with  its  interest  in  energetics,  since  the  energy  of  oscil- 
lations is  determined  solely  by  the  modulus  of  the  amplitude,  to  be 
more  precise,  by  the  modulus  of  the  spectral  density.  If  the  oscillation 
is described by a real function f(t), the energy flux for distinct types
of oscillations is proportional to $[f(t)]^2$ or $[f'(t)]^2$ so that the total
oscillation energy during time $-\infty < t < \infty$ is expressed in terms of
the integrals $I_0$ or $I_1$:

$$I_0 = \int_{-\infty}^{\infty} [f(t)]^2\,dt, \quad I_1 = \int_{-\infty}^{\infty} [f'(t)]^2\,dt$$

But these integrals are readily expressible in terms of the square
of the modulus of the spectral density. To obtain this expression for
$I_0$, square both sides of (4) and then combine both integrals into one
in the right-hand member:

$$[f(t)]^2 = \int F(\omega)e^{i\omega t}\,d\omega \int F(\omega)e^{i\omega t}\,d\omega = \iint F(\omega_1)F(\omega_2)\,e^{i(\omega_1+\omega_2)t}\,d\omega_1\,d\omega_2$$

Now integrate both sides with respect to t and take advantage of the
formulas (5), (9), and also of (4) of Ch. 6:

$$I_0 = \int [f(t)]^2\,dt = \iiint F(\omega_1)F(\omega_2)\,e^{i(\omega_1+\omega_2)t}\,dt\,d\omega_1\,d\omega_2$$
$$= 2\pi\iint F(\omega_1)F(\omega_2)\,\delta(\omega_1 + \omega_2)\,d\omega_1\,d\omega_2$$
$$= 2\pi\int F(-\omega_2)F(\omega_2)\,d\omega_2 = 2\pi\int |F(\omega)|^2\,d\omega$$

Similarly, since $i\omega F(\omega)$ serves as the Fourier transform of the
function $f'(t)$, it follows that

$$I_1 = \int [f'(t)]^2\,dt = 2\pi\int |i\omega F(\omega)|^2\,d\omega = 2\pi\int \omega^2|F(\omega)|^2\,d\omega$$

The corresponding formulas for periodic functions with period T
were actually derived in Sec. 14.7 (see (53)):

$$I_0 = \int_0^T [f(t)]^2\,dt = T\sum_{k=-\infty}^{\infty} |a_k|^2$$

where $a_k$ are the coefficients of the expansion of the function f(t) in
the Fourier series (40). Here it suffices to find the energy during a
single period, and in place of the integral with respect to the frequency
we have a sum. The expression for $I_1$ is similar.
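The equality of the two expressions for I₀ can be verified for the bell-shaped pair of Sec. 14.5, namely f(t) = Ce^{-at²} with F(ω) = (C/2√(πa))e^{-ω²/4a}. The sketch below evaluates both integrals by plain Riemann sums on grids chosen wide enough for the tails to be negligible; the function name and grid parameters are assumptions of this illustration.

```python
import numpy as np

def energy_both_ways(a, C=1.0, n=20001):
    """I0 computed in time from f(t) = C exp(-a t^2) and in frequency from
    F(w) = C / (2 sqrt(pi a)) * exp(-w^2 / (4 a))."""
    t = np.linspace(-10 / np.sqrt(a), 10 / np.sqrt(a), n)
    w = np.linspace(-20 * np.sqrt(a), 20 * np.sqrt(a), n)
    f = C * np.exp(-a * t**2)
    F = C / (2 * np.sqrt(np.pi * a)) * np.exp(-w**2 / (4 * a))
    in_time = np.sum(f**2) * (t[1] - t[0])
    in_frequency = 2 * np.pi * np.sum(np.abs(F)**2) * (w[1] - w[0])
    return in_time, in_frequency

for a in (0.5, 1.0, 3.0):
    in_time, in_frequency = energy_both_ways(a)
    print(a, in_time, in_frequency, np.sqrt(np.pi / (2 * a)))
```

Both integrals agree with each other and with the closed-form value C²√(π/2a).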

The two expressions of I, in terms of the integral with respect
to time or the integral (sum) with respect to the frequencies, can
be visualized as the expression of the square of the modulus of a vector
by the Pythagorean theorem as the sum of the squares of its components
(cf. the interpretation of formula (53) in Sec. 14.7). Here, there are
different expressions because we make use of different systems of
coordinates: in one, the coordinates are values of the function at distinct
points, $f(t)$, $f(t + \Delta t)$, ..., in the other, the coordinates are Fourier
coefficients. The equality of the two expressions for I shows that each
of them is complete: not a single one of the components of the vector
is forgotten.

Thus, the expression for energy does not depend on the phase of
the oscillations: replacing $F(\omega)$ by $F(\omega)\,e^{i\alpha(\omega)}$, where $\alpha(\omega)$ is a real
function, does not alter the integral I. Not only the total energy
remains unchanged, but also the energy obtained by an oscillator tuned
to one or another frequency. In the energy approach, the phase of
oscillations is not essential.

In  this  20th  century,  particularly  in  the  latter  half,  special  im- 
portance is  attached  to  the  transmission  of  information  — from  radio 
and  television  to  cybernetic  systems  of  control  and  research  problems. 
It is clear that we lose information when we fail to record the phase:
if different f(t) correspond to the same spectrum, with respect to
$|F|^2$, with distinct phases, then when the phase is unknown we cannot
reproduce f(t) if all we know is $|F|^2$. Recording $|F|$ and the phase,
we would obtain, to put it crudely, twice as much information from
the given wave. What is more, it appears possible to transmit
information via changes in the phase of the oscillations at a fixed amplitude.

There are two modes of transmitting information in radio
engineering: by means of amplitude modulation (Fig. 201),

$$f(t) = a(t)\cos\omega_0 t$$

and by means of frequency modulation (Fig. 202),

$$f(t) = a_0\cos\left(\omega_0 t + c\int^t b(t)\,dt\right)$$

Fig. 201

Fig. 202

The information to be transmitted is contained in the function
a(t) in the former case and in the function b(t) in the latter.
From the viewpoint of harmonic analysis, that is, the expansion of
f(t) into the Fourier integral, in both cases we have to do with a
spectrum in which $F(\omega)$ is nonzero in the neighbourhood of $\omega_0$, which
is a quantity called the carrier frequency.

In the case of amplitude modulation, to determine the spectrum
we expand a(t):

$$a(t) = \int A(\omega)\, e^{i\omega t}\, d\omega$$

Clearly, if $A(\omega)$ is nonzero only in the band
$-\Delta < \omega < \Delta$, the expansion of f(t) will be concentrated
in the band of frequencies of the same width, i.e. $F(\omega) \neq 0$
for $\omega_0 - \Delta < \omega < \omega_0 + \Delta$ (and of course for
$-\omega_0 - \Delta < \omega < -\omega_0 + \Delta$ which, by virtue of
(5), does not yield anything new for the frequencies). This width does
not depend on the absolute value of the amplitude.
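
The concentration of the spectrum about the carrier frequency can be checked numerically. Below is a minimal sketch in Python, assuming a hypothetical modulating function a(t) = 1 + 0.5 cos(Ωt) and measuring frequencies in whole cycles per sampling window; the names `dft`, `CARRIER` and `SLOW` and all the numerical values are our own choices for illustration.

```python
import cmath
import math


def dft(samples):
    """Plain discrete Fourier transform, normalized by the number of samples."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * j / n) for j, x in enumerate(samples)) / n
        for k in range(n)
    ]


N = 128       # number of samples in the window
CARRIER = 20  # carrier frequency omega_0, in cycles per window (assumed value)
SLOW = 3      # frequency of the modulating function a(t) (assumed value)

samples = [
    (1 + 0.5 * math.cos(2 * math.pi * SLOW * j / N)) * math.cos(2 * math.pi * CARRIER * j / N)
    for j in range(N)
]

spectrum = dft(samples)
occupied = [k for k in range(N // 2) if abs(spectrum[k]) > 1e-9]
print(occupied)  # [17, 20, 23]
```

The only occupied positive frequencies are the carrier and the two side frequencies CARRIER ± SLOW, so the band has the same width as the band of a(t) itself; multiplying a(t) by a constant changes the heights of these lines but not their positions.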

In the case of frequency modulation the instantaneous value of
the frequency is equal to the derivative of the phase:

$$\omega(t) = \frac{d}{dt}\left[\omega_0 t + c \int^{t} b(t)\, dt\right] = \omega_0 + c\, b(t)$$

However, by the uncertainty principle (Sec. 14.5), we must have
a sufficient time interval $\tau$ for the frequency to change
substantially and for this change to be recorded:

$$\tau > (c\, b(t))^{-1}$$

where b is to be understood as $(\overline{b^2})^{1/2}$. The
corresponding time of variation of the function b(t) itself depends on
the frequency of the signal: this time is equal to $\tau = 1/\Delta$.
Thus, if $c\, b(t)/\Delta > 1$, then the width of the band used for
transmission, that is, the width of the spectrum $F(\omega)$ of the
function f(t), is equal to $c\, b(t)$ and is greater than the width of
the frequency signal $\Delta$.
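
The statement that the instantaneous frequency is the derivative of the phase can be verified numerically. The sketch below, in Python, assumes a hypothetical signal b(t) = sin t, takes the lower limit of the integral to be 0, and uses made-up values of the carrier frequency and of the constant c; none of these choices comes from the text.

```python
import math

OMEGA0 = 50.0  # carrier frequency omega_0 (assumed value)
C = 4.0        # modulation constant c (assumed value)


def b(t):
    """Hypothetical information-carrying function."""
    return math.sin(t)


def phase(t):
    """omega_0 t + c * integral from 0 to t of sin(s) ds = omega_0 t + c (1 - cos t)."""
    return OMEGA0 * t + C * (1.0 - math.cos(t))


def instantaneous_frequency(t, step=1e-5):
    """Central-difference estimate of the derivative of the phase."""
    return (phase(t + step) - phase(t - step)) / (2 * step)


for t in (0.0, 0.5, 1.0, 2.0):
    print(t, instantaneous_frequency(t), OMEGA0 + C * b(t))
```

The two printed columns agree to within the accuracy of the difference quotient, and the frequency sweeps over the range from $\omega_0 - c$ to $\omega_0 + c$ as b(t) varies between -1 and 1.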

Under certain conditions, a large width of the radio signal
being transmitted has an advantage and permits reducing interference.

Figs. 201 and 202 show, in particular, that two functions f(t)
with similar spectrum and similar relationship $|F(\omega)|^2$ may
appear to be quite different. This difference is due to the difference
of the phases, i.e., of the function $\varphi(\omega)$ in the expression
$F(\omega) = \sqrt{|F(\omega)|^2}\, e^{i\varphi(\omega)}$.

ANSWERS  AND  SOLUTIONS 


Sec.  14.1 

1. $F(\omega) = \sum_k a_k\, \delta(\omega - \omega_k)$.

2. This occurs if all $\omega_k$ are commensurable, i.e.,
$\omega_k = n_k \alpha$, where $\alpha$ does not depend on k and all
$n_k$ are integers. Then all terms, and so also the sum, have period
$T = \frac{2\pi}{\alpha}$.


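Sec. 14.2

1. $$F(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-a|t|}\, e^{-i\omega t}\, dt = \frac{1}{2\pi} \left[ \int_{-\infty}^{0} e^{at - i\omega t}\, dt + \int_{0}^{\infty} e^{-at - i\omega t}\, dt \right]$$

$$= \frac{1}{2\pi} \left( \frac{1}{a - i\omega} - \frac{1}{-a - i\omega} \right) = \frac{1}{\pi}\, \frac{a}{a^2 + \omega^2}$$

If we put $a = \frac{1}{m}$, we get precisely one of the examples from
which we obtained the delta function in Sec. 6.1 as $m \to \infty$
(i.e. $a \to +0$). Hence, for f(t) = 1 it will be true that
$F(\omega) = \delta(\omega)$. The formula (4) yields

$$e^{-a|t|} = \int_{-\infty}^{\infty} \frac{1}{\pi}\, \frac{a}{a^2 + \omega^2}\, e^{i\omega t}\, d\omega$$

That is,

$$\int_{-\infty}^{\infty} \frac{a \cos t\omega}{a^2 + \omega^2}\, d\omega = \pi e^{-a|t|}$$

When a = 1 we get formula (1) of Ch. 1 in different notation.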
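2. From (17) we get

$$x(t) = \int_{-\infty}^{\infty} \frac{F(\omega)\, e^{i\omega t}}{m(\omega_0^2 - \omega^2) + i\omega h}\, d\omega$$

where

$$F(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\, dt$$

Replacing the variable of integration t by $\tau$ in the last integral,
then substituting this integral into the first one, and finally
changing the order of integration, we get, after some manipulation,

$$x(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(\tau)\, d\tau \int_{-\infty}^{\infty} \frac{e^{i\omega(t - \tau)}}{m(\omega_0^2 - \omega^2) + i\omega h}\, d\omega$$

The inner integral can be evaluated by the methods of Sec. 5.9.
The integrand has simple poles for

$$\omega = \omega_{1,2} = \frac{ih}{2m} \pm \sqrt{-\frac{h^2}{4m^2} + \omega_0^2} = i\gamma \pm \tilde{\omega}_0$$

(here $\gamma = \frac{h}{2m}$ and $\tilde{\omega}_0 = \sqrt{\omega_0^2 - \frac{h^2}{4m^2}}$).
Both of these values are found in the upper half-plane.

For $t > \tau$ we find, using the upper semicircle $|\omega| = R \to \infty$,
that the inner integral is equal to

$$2\pi i \left[ \frac{e^{i\omega_1 (t - \tau)}}{-2m\omega_1 + ih} + \frac{e^{i\omega_2 (t - \tau)}}{-2m\omega_2 + ih} \right] = \frac{2\pi}{m\tilde{\omega}_0}\, e^{-\gamma(t - \tau)} \sin \tilde{\omega}_0 (t - \tau)$$

For $t < \tau$ we have to use the lower semicircle, which leads to the
integral vanishing. Thus,

$$x(t) = \frac{1}{m\tilde{\omega}_0} \int_{-\infty}^{t} e^{-\gamma(t - \tau)} \sin \tilde{\omega}_0 (t - \tau)\, f(\tau)\, d\tau$$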
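
Sec. 14.4

1. The image of the function f(at) is

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} f(at)\, e^{-i\omega t}\, dt = \frac{1}{2\pi} \int_{-a\infty}^{a\infty} f(t_1)\, e^{-i\frac{\omega}{a} t_1}\, \frac{dt_1}{a} = \frac{1}{a}\, F\left(\frac{\omega}{a}\right)$$

with the substitution $at = t_1$ for a > 0; for a < 0, we have
to invert the limits of integration, which yields

$$-\frac{1}{a}\, F\left(\frac{\omega}{a}\right) = \frac{1}{|a|}\, F\left(\frac{\omega}{a}\right)$$

2.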

m 1 a 1 • a ICO 

£-if0(co-P) — , — fcco == • 

tx  ot2  + (co  - P)2  a a2  + co2  a2  + co* 


Sec. 14.5

1. Since a shift along the t-axis does not affect the uncertainty
relation, we put f(t) = C (|t| < h), f(t) = 0 (|t| > h) so that
$\Delta t = 2h$. Then by (10),
$F(\omega) = \frac{Ch}{\pi}\, \frac{\sin \omega h}{\omega h}$. For the
width of the function obtained we take the width of the middle "hump",
i.e. the distance between adjacent zeros. Then
$\Delta\omega = \frac{\pi}{h}$, whence

$$\Delta t \cdot \Delta\omega = 2h\, \frac{\pi}{h} = 2\pi = \text{constant}.$$

2. If in some way we have determined the width $\Delta t$ of the
function f(t) and the width $\Delta\omega$ of its density $F(\omega)$,
then the width of the function f(at) is equal to $\frac{\Delta t}{|a|}$
and the width of its density $\frac{1}{|a|}\, F\left(\frac{\omega}{a}\right)$
is equal to $|a|\, \Delta\omega$. Here,
$\frac{\Delta t}{|a|} \cdot |a|\, \Delta\omega = \Delta t \cdot \Delta\omega$,
that is to say, it does not depend on a.


Sec.  14.6 


, . 2 A v-n  *A 

(a) — 

7T  - - - 


4A  cos  2k<xt 


(b)  -y  - — E 

2 TC  k = l 


k=l  77  Ak2  — 1 


sin- 


2kizt 


(C) 


— -h  — ^2  cos  — * The  period  of  the  kth  harmonic  is  equal 
2T  rjfei  t v H 

. 2r 

to  • 

k 

Sec. 14.7

For $m \neq n$ we have

$$\int_0^T \cos\frac{2m\pi}{T}t\, \cos\frac{2n\pi}{T}t\, dt = \frac{1}{2} \int_0^T \left[ \cos\frac{2(m+n)\pi}{T}t + \cos\frac{2(m-n)\pi}{T}t \right] dt$$

$$= \frac{1}{2} \left[ \frac{T}{2(m+n)\pi} \sin\frac{2(m+n)\pi}{T}t + \frac{T}{2(m-n)\pi} \sin\frac{2(m-n)\pi}{T}t \right]_{t=0}^{T} = 0$$

Here, m = 0 is possible; then $\cos\frac{2m\pi}{T}t = 1$. In similar
fashion we can verify the orthogonality to each other of the sines, and
also of the sine and cosine. If the integral is taken from 0 to
$\frac{T}{2}$, then the sines are no longer orthogonal to the cosines
(the integral turns out to be nonzero). For the interval
$0 \leq t \leq 2T$ the integrals are also zero, but since all functions
(52) are periodic with period T, only that function can be expanded (in
terms of these functions) whose values on the portion
$T \leq t \leq 2T$ are an exact repetition of its values on the portion
$0 \leq t \leq T$.

For $m \neq n$ it will be true that

$$\int_0^T e^{\frac{2m\pi i}{T}t} \left( e^{\frac{2n\pi i}{T}t} \right)^{*} dt = \int_0^T e^{\frac{2(m-n)\pi i}{T}t}\, dt = \frac{T}{2(m-n)\pi i}\, e^{\frac{2(m-n)\pi i}{T}t}\, \Big|_{t=0}^{T} = 0$$


Chapter  15 

DIGITAL  COMPUTERS 


In  conclusion  at  least  a few  words  are  in  order 
concerning  the  electronic  digital  computers  that 
have  wrought  a revolution  in  modern  applied 
mathematics,  and  even  beyond. 

The most elementary computing devices like the slide rule, the abacus,
the desk calculator, and tables have been in use for a long time.
But they fall far short of the demands made by modern engineering,
science and economics. Many important problems are solvable in
principle but require such an enormous amount of computation that their
solution by the above-mentioned devices cannot be obtained in
reasonable periods of time. In order to obtain solutions, one had to
forego many essential factors and this led to quantitative and
ofttimes fundamental errors.

The search for more effective computational tools and the
attainments of modern technology led to the construction of electronic
digital computers whose speed exceeds that of the classical
tools by many orders of magnitude. Although the underlying idea
of such machines had been advanced in the 19th century, only the
development of modern electronics made their realization a possibility.
The first electronic digital computer was constructed in the
U.S.A. in 1943, and in 1946 the American mathematician J. von
Neumann (1903-1957) formulated the basic ideas and principles
for constructing such machines. The widespread introduction of
computers made it possible to solve a series of important problems
and extended appreciably the range of fields in which mathematics
and the allied science of cybernetics achieved useful results. There
can be no question that the further development and, particularly,
the expanded use of calculating machines will in this generation
result in a fundamental reorganization of scientific research,
technology, economic calculations, management, and so forth.

15.1  Analogue  computers 

Suppose a certain problem has been reduced to the solution
of a mathematical equation. For short we will call the quantities
that enter into the equation mathematical quantities (although
they may have a definite physical meaning, which was why the
equation was under consideration in the first place).

There are two basic modes of representing mathematical quantities
that are to be operated on computationally. In one mode,
these quantities are represented directly by the physical quantities
of length, angle, electric voltage, and so forth, irrespective of the
meaning that the original physical problem had. Then a physical
scheme is outlined in which the physical quantities are transformed
via the same law as the mathematical quantities are to be. A computing
machine based on such a principle is an analogue computer,
the simplest instance of which is the slide rule, where the addition of
mathematical quantities (the logarithms of the factors) is simulated
by the addition of lengths. Here is another example: to compute
an integral we can simulate the integrand by the law of opening
a water faucet; then the integral itself will correspond to the total
volume of water, which can be measured with ease. The most
common analogue machines are those based on electrical analogies.
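
The principle of the slide rule can be imitated in a few lines. The following Python sketch is only an illustration under our own assumptions: lengths are measured in units of one decade of the scale, and the limited accuracy of reading a scale is modelled by rounding to three significant digits; the function names are ours.

```python
import math


def slide_rule_multiply(x, y):
    """Multiply two positive numbers by adding the lengths log10(x) and log10(y)."""
    length_x = math.log10(x)
    length_y = math.log10(y)
    total_length = length_x + length_y  # the two lengths are laid end to end
    return 10 ** total_length           # read the product off the scale


def read_scale(value, significant_digits=3):
    """Keep only the few digits that can realistically be read off a scale."""
    return float(f"{value:.{significant_digits}g}")


print(read_scale(slide_rule_multiply(2.0, 3.0)))
print(read_scale(slide_rule_multiply(3.14159, 2.71828)))
```

The rounding step is the essential feature: however exact the law being simulated, the result of an analogue computation is known only to the accuracy with which a physical quantity can be set and measured.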

In the second mode, a device is made to represent in digital
notation the mathematical quantities involved; the operations on
these quantities are then replaced by arithmetical operations on the
digits (numbers). Computing machines based on this principle are
called digital computers. They include the abacus and the desk
calculator. The biggest advances have been made by digital machines,
but for the sake of completeness we will take a brief glance at
analogue computers as well.

First of all, in analogue machines the input parameters, that is, the
operands, or quantities that are to be operated on, ordinarily are
capable of varying continuously over certain ranges; hence, this class
of machines is termed continuous-mode machines. True, the input
parameters could possibly be in the form of electric resistance that is
varied discretely with the aid of a resistance box; such discreteness
would not be of a fundamental nature, but only due to the design of the
machine, whereas digital machines are fundamentally machines of
discrete operation (discrete-mode machines).

Furthermore, it is clear that the precision of input of the parameters
and that of the results is not great in the case of analogue machines.
Ordinarily it is of the order of a few percent or, at best, tenths of
a percent. This naturally restricts the possibilities of simulating
involved computations. Besides, analogue machines are usually rather
specialized, adapted to the solution of a definite narrow class of
problems. But if such specialized problems have to be solved over and
over again and high accuracy is not required, then the use of analogue
machines and devices proves to be extremely effective. For example,
electrical integrators for the solution of systems of ordinary
differential equations are in wide use.

It often happens that one and the same relationship of quantities
can be effected via different physical schemes. This justifies the
simulating of physical processes. Suppose we have a physical system
in which it is difficult to measure or compute some quantity S directly.
We construct a new system of a different physical nature in which
the quantities involved exhibit the very same functional relationship,
and then the quantity corresponding to S is measured in the new
system. What is more, if the mathematical equivalence of the two
physical systems has been established, then the mathematical solution
of the problem is unnecessary and so no calculations need be
made at all, unless they are needed for some other purpose. Actually,
what is utilized here is the similarity of phenomena based on specific
characteristics (Sec. 8.10) with a dimensional proportionality
(similarity) factor. In recent years, the analogue solution of problems
based on electromechanical, optical-mechanical, electrodiffusion, etc.
analogies has seen extensive development.

If a process is being studied and by the physical meaning of the
problem the independent variable is the time t, then it often happens
that the solution on an analogue machine is obtained as a function
of t. We then say that the problem is solved in real time and the
solution can be transferred directly (without human intervention) to the
system under study for further use; this is the underlying principle
of many automatic control devices. The use of real time also makes
it possible to replace expensive equipment with computing machines
when testing complicated engineering designs. For instance, an
automatic pilot can be tested not in flight, which is expensive and
dangerous, but on a test bed with the aircraft replaced by an analogue
machine, which reacts to the automatic pilot as if it were the
aircraft.

15.2  Digital  computers 

Let  us  now  discuss  digital  computing  machines.  We  will  not 
consider  the  more  common  desk  calculators  that  perform  the  four 
operations  of  arithmetic.  These  are  very  useful  devices  but  they  are 
rather  slow  because  of  the  low  speed  of  the  computation  itself  and 
also  because  the  input  data  for  each  arithmetic  operation  are  entered 
manually  by  the  operator.  Both  of  these  drawbacks  are  overcome 
in  the  electronic  digital  computer. 

Fig. 203 is a block diagram of a digital computer showing the basic
units and their interrelationships; the information paths are denoted
by solid arrows, the control paths by dashed arrows. The memory
unit (MU) has a definite number, say, 512, of storage elements
(locations) each of which serves to record a single number or
instruction. As we shall see in Sec. 15.3, the notation of an
instruction does not differ in any way from that of a number. Some of
these elements are filled from the input unit, whereas others are
filled in the process of calculation (the contents of the storage
elements may change many times during a computation or some may not be
used at all). Guided by programmed instructions, the control unit (CU)
sends numbers (or instructions) from the memory unit to the arithmetic
unit (AU), which transforms them in accordance with the instructions
and returns them to memory. On signals from the control unit, required
results are automatically printed, and when the computation is
completed, the control unit stops the machine.

By means of accessory equipment, a digital computer can give
the result in the form of a graph approximated by computed coordinates
of points or even a moving-picture display showing the variation
of such a graph.

A general-purpose digital computer is capable of solving any
mathematical problem that has an algorithm (a clear-cut sequence of
procedures determining the computational process such that
starting with the input data leads of necessity to the required result).
However, nonmathematical problems for which a similar algorithm
can be indicated are also solvable on these machines. For example,
in metalworking a digital computer can direct the machining of a
workpiece with contours of any degree of complexity. The input data
describe the shape and the results of computation are transformed
into signals delivered to the control unit that operates the machine
tool. The operation scheme is similar for controlling the flight of an
aircraft or a technological process. In the case of deviations from the
originally specified program, the machine can make optimal decisions
by comparing possible variants, and it can also check the result.
Digital computers have found extensive use in weather forecasting,
transport problems, and the like. Incidentally, special-purpose
computers are more effective in such cases than general-purpose machines
because they are specially designed for the solution of a narrow range
of problems.

15.3 Representation of numbers and instructions in digital computers

The decimal (denary, or base-10) number system that we studied
in school and use in everyday life is not convenient when working
with digital computers. The binary system (base-2) is used for number
notation in such machines; here only two digits, 0 and 1, are
needed (whereas in the base-10 system we use 10 digits) and all
numbers are represented as combinations of these digits. For instance,
the numerals 10, 100, etc. indicate powers of two (not ten), i.e., the
numbers two, four, etc. This can be written as follows:

$$1_{10} = 1_2, \quad 2_{10} = 10_2, \quad 3_{10} = 11_2, \quad 4_{10} = 100_2, \quad 5_{10} = 101_2, \quad \ldots$$

where the subscript indicates the number system.

Any integer can be written in the base-2 system merely by isolating
powers of two beginning with the highest one. For example,
in base-10 we have

$$1971 = 1 \cdot 1024 + 1 \cdot 512 + 1 \cdot 256 + 1 \cdot 128 + 0 \cdot 64 + 1 \cdot 32 + 1 \cdot 16 + 0 \cdot 8 + 0 \cdot 4 + 1 \cdot 2 + 1 \cdot 1$$

or $1971_{10} = 11110110011_2$.

In similar fashion, using binary numbers, we can write bicimal
fractions in place of ordinary decimal fractions. For example, take
the binary number $10.1011_2$. It is equivalent, in base-10, to
$2 + \frac{1}{2} + \frac{1}{8} + \frac{1}{16} = \frac{43}{16} = 2.6875_{10}$.
Any nonintegral number can be written in the form of a finite or
infinite bicimal; fractions are naturally rounded off in actual
computations.
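
The conversion just described is itself a simple algorithm. The following Python sketch isolates powers of two for the integral part and repeatedly doubles the fractional part; the function names are our own, and the fractional digits are truncated (not rounded) after a chosen number of places.

```python
def integer_to_binary(n):
    """Write a nonnegative integer in base 2 by isolating powers of two,
    beginning with the highest one."""
    if n == 0:
        return "0"
    power = 1
    while power * 2 <= n:
        power *= 2
    digits = []
    while power >= 1:
        if n >= power:
            digits.append("1")
            n -= power
        else:
            digits.append("0")
        power //= 2
    return "".join(digits)


def fraction_to_binary(x, places):
    """First `places` binary digits after the point of a number 0 <= x < 1."""
    digits = []
    for _ in range(places):
        x *= 2
        if x >= 1:
            digits.append("1")
            x -= 1
        else:
            digits.append("0")
    return "".join(digits)


def binary_to_decimal(text):
    """Value of a binary numeral such as '10.1011'."""
    whole, _, fraction = text.partition(".")
    value = int(whole, 2) if whole else 0
    for position, digit in enumerate(fraction, start=1):
        value += int(digit) * 2 ** -position
    return value


print(integer_to_binary(1971))        # 11110110011
print(binary_to_decimal("10.1011"))   # 2.6875
print(fraction_to_binary(0.6875, 4))  # 1011
```

A fraction such as 1/3 never terminates in base 2, which is why `fraction_to_binary` must be told how many places to keep.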

Tables of addition and multiplication in the base-2 system are
very simple:

$$0 + 0 = 0, \quad 1 + 0 = 0 + 1 = 1, \quad 1 + 1 = 10 \qquad (1)$$

$$0 \cdot 0 = 0, \quad 1 \cdot 0 = 0 \cdot 1 = 0, \quad 1 \cdot 1 = 1 \qquad (2)$$

Using these tables, we can perform arithmetic operations on numbers
written in the binary (base-2) system in exactly the same way we
do in the denary (base-10) system.
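
To see that nothing beyond the two tables is needed, here is a minimal Python sketch of column-by-column addition and of long multiplication on binary numerals written as strings. It assumes the usual addition rule 1 + 0 = 0 + 1 = 1; the names `ADD_TABLE`, `add_binary` and `multiply_binary` are ours.

```python
# Addition table for single binary digits; the result "10" means "0, carry 1".
ADD_TABLE = {("0", "0"): "0", ("0", "1"): "1", ("1", "0"): "1", ("1", "1"): "10"}


def add_binary(a, b):
    """Add two binary numerals digit by digit, from the right, with carries."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = "0"
    digits = []
    for digit_a, digit_b in zip(reversed(a), reversed(b)):
        partial = ADD_TABLE[(digit_a, digit_b)]   # sum of the two digits
        total = ADD_TABLE[(partial[-1], carry)]   # then add the incoming carry
        digits.append(total[-1])
        carry = "1" if len(partial) == 2 or len(total) == 2 else "0"
    if carry == "1":
        digits.append("1")
    return "".join(reversed(digits)).lstrip("0") or "0"


def multiply_binary(a, b):
    """Long multiplication: by the multiplication table, each digit of b
    contributes either a shifted copy of a or nothing."""
    product = "0"
    for shift, digit in enumerate(reversed(b)):
        if digit == "1":
            product = add_binary(product, a + "0" * shift)
    return product


print(add_binary("1011", "110"))      # 10001, i.e. 11 + 6 = 17
print(multiply_binary("101", "11"))   # 1111, i.e. 5 * 3 = 15
```

Multiplication reduces entirely to shifting and adding, which is one reason the binary system suits machine arithmetic so well.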

Of course, do not think that a computing expert uses base-2 when
working with a computer. Programs are written in base-10 notation
and the conversion to base-2 is handled inside the computer by a
special instruction, which we dispense with here so as to simplify our
discussion. Neither will we go into the base-8 (octonary) and
binary-coded decimal systems, which are used in such cases. The
computation result is printed by the machine in the ordinary base-10
notation.

Numbers are entered in the memory unit (Sec. 15.2) and each
one is placed in a storage element (location); all storage elements are
of the same length, that is to say, they have the same number of
orders (digit positions), each of which contains 0 or 1. Soviet digital
computers have floating-point representation, which means that we
write a number located between $-1$ and 1 together with the power of
two by which this number has to be multiplied. A certain number
of digit positions (at the end of the storage element, for instance) are
allotted to the exponent and the symbol code (that is, the sign) of the
exponent.

For example, suppose there are six orders for these digits in a
location accommodating 30 orders, the + sign represented as 0,
the - sign as 1. Then the set of digits in the storage location

    symbol code                              symbol code   exponential
    of number    fractional part             of exponent   part
        1        10101100000000000000000          0          00101       (3)

denotes the number

$$-0.101011_2 \cdot 2^{+101_2} = -\left[ \left( \frac{1}{2} + \frac{1}{8} + \frac{1}{32} + \frac{1}{64} \right) \cdot 2^{4+1} \right]_{10} = -21.5_{10}$$

The largest number that can be entered in this fashion is of the
form

        0        11111111111111111111111          0          11111       (4)

and is equal to $(1 - 2^{-23})\, 2^{31} = 2^{31} - 2^{8} \approx 2^{31}$,
whereas the smallest positive number has the form

        0        00000000000000000000001          1          11111       (5)

and is equal to $2^{-23} \cdot 2^{-31} = 2^{-54}$ (think this through!).
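
The three examples can be checked mechanically. The Python sketch below decodes a 30-digit location under the layout that the numbers above imply, namely one digit for the sign of the number, 23 digits for the fractional part, one digit for the sign of the exponent and five digits for the exponent; this split, like the names `decode_word`, `WORD_3`, `WORD_4` and `WORD_5`, is our own reading of the example and not a description of any particular machine.

```python
from fractions import Fraction

FRACTION_BITS = 23  # assumed width of the fractional part
EXPONENT_BITS = 5   # assumed width of the exponent (without its sign)


def decode_word(bits):
    """Decode a string of 30 characters '0'/'1' laid out as
    [sign][23 fraction digits][exponent sign][5 exponent digits]."""
    if len(bits) != 1 + FRACTION_BITS + 1 + EXPONENT_BITS:
        raise ValueError("a location must hold exactly 30 digits")
    sign = -1 if bits[0] == "1" else 1
    fraction = Fraction(int(bits[1:1 + FRACTION_BITS], 2), 2 ** FRACTION_BITS)
    exponent_sign = -1 if bits[1 + FRACTION_BITS] == "1" else 1
    exponent = exponent_sign * int(bits[2 + FRACTION_BITS:], 2)
    return sign * fraction * Fraction(2) ** exponent


WORD_3 = "1" + "101011" + "0" * 17 + "0" + "00101"  # example (3)
WORD_4 = "0" + "1" * 23 + "0" + "11111"             # largest number, (4)
WORD_5 = "0" + "0" * 22 + "1" + "1" + "11111"       # smallest positive number, (5)

print(decode_word(WORD_3))  # -43/2
print(decode_word(WORD_4))  # 2147483392
print(decode_word(WORD_5))  # 1/18014398509481984
```

Exact fractions are used here so that the results can be compared directly with $-21.5$, $2^{31} - 2^{8}$ and $2^{-54}$ without any rounding of our own.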

Actually, the recording of digits is done in different ways depending
on the design of the machine. So that the numbers can be entered
in the input unit of the machine (Sec. 15.2), they are punched on
special tape (punched tape) or on a set of special cards (punched
cards). For the sake of definiteness we will discuss only cards. A row
on a card corresponds to a storage location, the length of the row being
equal to the number of orders in the storage element; a punched hole
corresponds to the digit 1, the absence of a hole to the digit 0. For
example, if the numbers indicated in (3), (4) and (5) followed in a
sequence in an input program, then the corresponding punched card
would have a portion like that shown in Fig. 204 (check this). The
holes are punched beforehand on a special perforator that is not
connected with the computer and resembles somewhat the ordinary
cash register in a shop.

A large  program  is  written  on  a whole  stack  of  cards,  so  if  an  error 
is  detected  in  the  program  or  if  the  input  data  are  changed,  then  any 
card  can  be  replaced  with  ease. 

Every storage location in the memory unit of the machine is
a set of code elements, each of which can be in one of two states. This
is for storing numbers and processing them as the computations
proceed. The number of elements in a set is equal to the number of orders
(digit positions) in the storage location, and each of the states of an
element corresponds to the value 0 or 1 in the appropriate order.
Such code elements come in different types, but they all have to
possess two properties: inertialessness, which means the instantaneous
transition from one state to the other, and stability, which means the
capability of remaining in a given state for an indefinitely long time
in the absence of a transition signal. For the code elements, use is
made of magnetic ferrite cores that change the direction of the
magnetic field when a current flows in the winding, magnetized and
unmagnetized portions on a magnetic drum or magnetic tape (their
operation is much like that of an ordinary tape recorder), and so forth.

Numbers are transferred from memory to the arithmetic unit
and back again through a system of channels (there may be as many
channels as there are digit positions in a location, in which case we
have a parallel-mode machine).

The more locations there are in the memory unit, the more
information can be entered in the computer and the greater the
flexibility of operation. For this reason, many computers have their
memory capacity extended by an external memory unit which is usually
in the form of a magnetic tape; a special device acting on instructions
provided for in advance can extract the numbers from the external
memory unit and transfer them to the internal memory unit or vice
versa, naturally. Whereas the number of locations in the internal
memory unit is in the thousands, the number of locations on magnetic
tape can run into millions. Reading numbers from the tape and writing
them back on the tape is slower than the same processes in the
internal memory because winding the tape takes time.

If the machine is processing information other than numbers
(say, words), then it has to be coded by means of binary digits. Then
a device has to be provided that can decode the result, that is,
return it to the natural form of the original information.

The computer performs operations on the numbers stored in
the memory unit by means of instructions that are entered in the
memory unit prior to operation and have the same type of notation
as numbers. We will consider only the so-called three-address
instructions. Each such instruction is a storage location that we can
imagine to be divided into four parts of definite length, say 3, 9, 9,
and 9 digits each. These parts accommodate:

(1) the code of the operation that is to be performed;

(2) the address (number of the storage location) of the first
operand (number to be operated on);

(3) the address of the second operand;

(4) the address to which the result is to be sent.

All locations are numbered serially. For example, if the memory
unit has 512 locations, then nine binary digits is just enough to
indicate the address (why?). Suppose the operation code for addition
is 1. Now if it is required to add the numbers located in locations 271
and 59 and the result is to be put in 422, then with our storage
location divided as indicated above the appropriate instruction will be
in the following form (verify this):

            001   100001111   000111011   110100110
    parts:  1st      2nd         3rd         4th          (6)

After this instruction has been carried out, all storage locations,
including 271 and 59, will remain unchanged (as they were before), with
the exception of 422, in which the sum of the numbers at the addresses
271 and 59 appears in place of what location 422 had before.
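
The verification asked for above amounts to writing four numbers in binary with the right number of digits. A minimal Python sketch follows, assuming the field widths 3, 9, 9 and 9 of the example and addresses running from 0 to 511; the function names are our own.

```python
def encode_instruction(opcode, first, second, result):
    """Pack a three-address instruction into fields of 3, 9, 9 and 9 binary digits."""
    if not 0 <= opcode < 2 ** 3:
        raise ValueError("the operation code must fit into 3 binary digits")
    for address in (first, second, result):
        if not 0 <= address < 2 ** 9:
            raise ValueError("an address must fit into 9 binary digits")
    fields = [format(opcode, "03b")] + [format(a, "09b") for a in (first, second, result)]
    return " ".join(fields)


def decode_instruction(word):
    """Recover (opcode, first, second, result) from the spaced binary notation."""
    return tuple(int(field, 2) for field in word.split())


word = encode_instruction(1, 271, 59, 422)
print(word)                      # 001 100001111 000111011 110100110
print(decode_instruction(word))  # (1, 271, 59, 422)
```

Nine digits suffice for the address because $2^9 = 512$; with a larger memory the address fields, and hence the whole location, would have to be longer.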

The instruction (6) is punched on a card and is stored in one of
the storage locations of the memory unit exactly as the number
$0.0110000111100011101111_2 \cdot 2^{-110_2}$ would be. The difference is
evident only when the machine is in operation, since if the program
has been properly written, signals in the form of instructions enter
the control unit only from those locations containing instructions.

Electronic circuitry provides for the operation of the computer,
including the operations designated by the instructions. These circuits
transmit electric signals according to specific rules. In the early days
of computers, these circuits were based on the use of electron tubes
(triodes and pentodes), then later on semiconductors (transistors),
and integrated circuits. The basic types of signal converters are shown
in Fig. 205. The first generates a signal (positive voltage of a sufficient
level) at the output B if and only if the signal at the input A is absent.
The second generates a signal at the output C if and only if signals
have been fed to both inputs A and B, and the third generates a
signal when a signal has been fed to at least one input. More
complicated converters are obtained by combining these elementary ones.
For example, check to see that the circuit shown in Fig. 206 realizes
the addition table (1) if a signal represents unity and absence of a
signal, zero. The multiplication table (2) is realized by the elementary
"and" circuit.
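
The three elementary converters and one way of combining them into an adder of two binary digits can be sketched as follows in Python. Since Fig. 206 is not reproduced here, the wiring below is only one possible arrangement and is our own assumption, as are the function names; a signal is represented by 1 and its absence by 0.

```python
def not_gate(a):
    """Signal at the output if and only if there is no signal at the input."""
    return 1 - a


def and_gate(a, b):
    """Signal at the output if and only if both inputs carry a signal."""
    return a & b


def or_gate(a, b):
    """Signal at the output if at least one input carries a signal."""
    return a | b


def half_adder(a, b):
    """Return the two digits (high, low) of the sum a + b of two binary digits."""
    high = and_gate(a, b)                                # 1 only for 1 + 1
    low = and_gate(or_gate(a, b), not_gate(high))        # 1 for 1 + 0 and 0 + 1
    return high, low


for a in (0, 1):
    for b in (0, 1):
        high, low = half_adder(a, b)
        print(f"{a} + {b} = {high}{low}")
```

The low digit is "one input or the other, but not both", which is exactly the combination of the three elementary converters; the high digit is the carry into the next position.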

We now give a rough outline of how a machine operates. At
the start, the program (which is a definite sequence of instructions)
and the initial information (the operands, or numbers to be operated
on) are entered in the memory unit by means of punched cards. Then
the control unit accepts the contents of the first location as an
instruction and, in accordance with this instruction, the arithmetic
unit performs the indicated operations. Then the control unit accepts
the contents of the second location as an instruction, and so forth,
with the exception of special instructions called "transfer of control",
which we will discuss in Sec. 15.4, and after the performance of which
the control unit does not proceed to the next location but to the
address indicated in the instruction. Certain instructions provide
for printing the contents of a location, but after the execution of
such an instruction the control unit then passes to the next location.
(True, if a lot of printing is required, the work is slow, since printing
takes more time than arithmetic operations.) This continues until
the control unit reaches an instruction to stop the computer, which
brings the computation to an end. There is also an emergency stop
which stops the machine if the operation provided for by the program
cannot be performed; for example, if in the course of a computation,
a number is obtained that is too big for the storage element (this
is called overflow). A special device makes it possible at any time,
from the control console, to check the contents of any location and
also to enter accessory information.

Exercises 

1. Write the following base-10 numbers in binary (base-2): 9999,
-1/3, 2.75.

2.  Write  the  same  numbers,  using  formula  (3)  as  a pattern. 

15.4  Programming 

We  now  give  a few  simple  examples  to  illustrate  the  basic  principles 
of  programming  a computer.  It  should  be  noted  that  writing  a serious 
program  is  a very  responsible  undertaking  and  often  requires  a large 
amount  of  time  and  skill.  For  the  sake  of  simplicity  we  will  write  the 
sign  of  the  operation  instead  of  the  operation  code  (for  instance,  the  + 
sign instead of the add code).

Suppose  it  is  required  to  find  the  solution  of  a system  of  equations 
of  the  first  degree  (system  (6)  of  Ch.  8),  assuming  the  values  of  the 
numbers $a_i$, $b_i$, $d_i$ to be known. The solution, as we know, can be
found  by  formulas  (7)  of  Ch.  8. 

We  will  place  the  instructions  in  storage  locations  labelled  1,2,  ... ; 
we  do  not  yet  know  how  many  instructions  will  be  needed.  The  six 
input parameters $a_1$, $b_1$, $a_2$, $b_2$, $d_1$, $d_2$ will be located, respectively, in
storage elements labelled a + 1, a + 2, a + 3, a + 4, a + 5, a + 6.
The  value  of  a will  be  determined  later  on.  Several  more  locations 
will  be  used  for  storing  intermediate  results.  It  does  not  matter  what 
the  state  of  these  locations  was  prior  to  starting  our  computation 
because  any  previous  writing  in  a location  is  automatically  erased 
when  a new  entry  is  made.  The  remaining  storage  elements  of  the 
memory  unit  are  not  used  in  the  writing  and  performance  of  this 
program. We begin by calculating the value of $d_1 b_2 - b_1 d_2$. To compute
$d_1 b_2$ we need the instruction

(1)  ×  a + 5  a + 4  a + 7

Here  we  indicate  the  number  of  the  instruction  though  actually  it 
is  never  punched  in  a card.  When  this  instruction  is  executed,  the 
number $d_1 b_2$ appears in (a + 7). The next instruction is

(2)  ×  a + 2  a + 6  a + 8

and after its execution the number $b_1 d_2$ appears in (a + 8). Next,
subtract from the number $d_1 b_2$ (location (a + 7)) the number $b_1 d_2$
(location  (a  + 8)).  Since  these  numbers  will  not  be  needed  any  more, 
the  result  can  be  put  in  (a  + 7)  via  the  instruction 

(3)  -  a + 7  a + 8  a + 7

Of  course  we  could  have  made  use  of  the  location  (a  + 9),  but  in 
more  complicated  programs  one  has  to  save  on  storage  space. 

In similar fashion we find the denominator $a_1 b_2 - b_1 a_2$:

(4)  ×  a + 1  a + 4  a + 8,

(5)  ×  a + 2  a + 3  a + 9,

(6)  -  a + 8  a + 9  a + 8

Now  take  the  numerator  located  at  (a  + 7)  and  divide  it  by 
the  denominator  located  at  (a  + 8),  and  then  print  the  result: 

(7)  : a + 7 a + 8 a + 7 

(8)  print  a + 7 

On  the  last  instruction,  the  machine  prints  the  contents  of  the  storage 
element, i.e., the value of x, and then proceeds to the next instruction.

We compute y in similar fashion, having in view that the denominator
for y has already been computed in (a + 8):

(9)  ×  a + 1  a + 6  a + 7,

(10)  ×  a + 5  a + 3  a + 9,

(11)  -  a + 7  a + 9  a + 7,

(12)  :  a + 7  a + 8  a + 7,

(13)  print  a + 7,

(14)  stop

The machine is stopped on the 14th instruction. Thus, our program
contains 14 instructions and so we can take a = 14. Then
the entire program will occupy 20 storage locations and will look
like this:

(1)  ×  19  18  21,

(2)  ×  16  20  22,

(3)  -  21  22  21,

(4)  ×  15  18  22,

(5)  ×  16  17  23,

(6)  -  22  23  22,

(7)  :  21  22  21,

(8)  print  21,

(9)  ×  15  20  21,

(10)  ×  19  17  23,

(11)  -  21  23  21,

(12)  :  21  22  21,

(13)  print  21,

(14)  stop,

(15)  $a_1$,

(16)  $b_1$,

(17)  $a_2$,

(18)  $b_2$,

(19)  $d_1$,

(20)  $d_2$

Thus,  all  the  intermediate  results  in  our  program  require  three 
intermediate-result  locations  (Nos.  21,  22,  and  23).  The  only  thing 
left to do now is to punch this program and start the machine. (Actually,
a few more instructions are required that are of no particular
importance to us; for instance, the instruction to enter the program
into the memory unit from the cards, and so on.)
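
To make the bookkeeping concrete, the 20-location program can be traced with a small simulator. The sketch below is only an illustration: the function name `run`, the tuple encoding of instructions and the sample coefficients are assumptions, and system (6) of Ch. 8 is assumed to have the form $a_1 x + b_1 y = d_1$, $a_2 x + b_2 y = d_2$.

```python
def run(memory):
    """Execute a three-address program stored in `memory`, starting at location 1.

    Instructions are tuples (operation, address, address, destination);
    data are plain numbers. Returns the list of printed values.
    """
    printed = []
    location = 1                       # the control unit starts at location 1
    while True:
        op, *addresses = memory[location]
        location += 1                  # by default, pass to the next location
        if op == "stop":
            return printed
        if op == "print":
            printed.append(memory[addresses[0]])
            continue
        first, second, dest = addresses
        if op == "*":
            memory[dest] = memory[first] * memory[second]
        elif op == "-":
            memory[dest] = memory[first] - memory[second]
        elif op == ":":
            memory[dest] = memory[first] / memory[second]
        else:
            raise ValueError(f"unknown operation {op!r}")


memory = {
    1: ("*", 19, 18, 21), 2: ("*", 16, 20, 22), 3: ("-", 21, 22, 21),
    4: ("*", 15, 18, 22), 5: ("*", 16, 17, 23), 6: ("-", 22, 23, 22),
    7: (":", 21, 22, 21), 8: ("print", 21),
    9: ("*", 15, 20, 21), 10: ("*", 19, 17, 23), 11: ("-", 21, 23, 21),
    12: (":", 21, 22, 21), 13: ("print", 21), 14: ("stop",),
    # Sample data (an assumption): 2x + y = 5, x + 3y = 10.
    15: 2, 16: 1,    # a1, b1
    17: 1, 18: 3,    # a2, b2
    19: 5, 20: 10,   # d1, d2
}

results = run(memory)
print(results)       # the value of x, then the value of y
```

Locations 21, 22 and 23 do not have to be initialized: they are created the first time an instruction writes to them, which mirrors the remark that the previous state of the intermediate-result locations does not matter.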

In  this  example,  the  number  of  instructions  in  the  program  was 
equal  to  the  number  of  operations.  But  modern  electronic  digital 
computers  were  created  mainly  for  computations  requiring  millions 
of  operations.  It  is  clearly  impossible  to  write  a separate  instruction 
for  every  operation  in  that  case.  Luckily,  in  such  a huge  volume  of 
computation  many  intermediate  computations  are  ordinarily  carried 
out  several  times  according  to  the  same  scheme.  Then  in  the  program 
we can provide for the formation of loops, in which the control unit
passes through one and the same portion of the program several times.
Loops are formed with the aid of the conditional transfer of control
instruction (jump), which looks like this:

jump  N1  N2  N3

We  wrote  “jump”  but  actually,  of  course,  one  has  to  indicate  the 
code  of  the  operation.  There  are  different  versions  for  realizing  this 
instruction.  For  the  sake  of  definiteness,  we  will  assume  that  on  this 
instruction the machine compares the contents (N1)' of location N1
and the contents (N2)' of location N2: if (N1)' < (N2)', then the control
unit executes the next instruction, but if (N1)' ≥ (N2)',
then it executes the instruction of location N3; in both cases the
contents of the locations remain unchanged. Say, the instruction

jump  1 1 N3 

means  that  after  reading  it  the  control  unit  executes  the  instruction 
contained in location N3. We can regard this instruction as an
unconditional transfer of control instruction.

As an example let us write a program for printing the table of
reciprocals  of  the  500  natural  numbers  between  2001  and  2500. 
Without  the  conditional  transfer  instruction  the  program  would  be 
very  large,  whereas  actually  it  is  very  small.  We  put  the  number  2000 
in  location  (a  + 1),  the  number  1 in  (a  + 2),  and  the  number  2499  in 
(a  + 3)  (it  will  soon  become  clear  why  this  is  done).  Suppose  the  first 
instruction  is  of  the  form 

(1)  +  a + 1  a + 2  a + 1

On execution of this instruction, the number 2001 appears in location
(a + 1) instead of 2000. We then compute the reciprocal of 2001 and print
the result:

(2)  :  a + 2  a + 1  a + 4,

(3)  print  a + 4 

We  compare  the  number  in  (a  + 1)  with  2499  and,  since  it  has 
not yet exceeded 2499, we pass again to the first instruction:

(4)  jump  a + 3 a + 1 1 

Since in this case (a + 3)' = 2499 and (a + 1)' = 2001, on reading
the fourth instruction, the control unit goes to the first instruction.
Upon executing this instruction, the number 2002 instead of
2001 appears in (a + 1). The next two instructions have to do with
computing and printing the reciprocal of 2002. On the fourth instruction
the machine compares 2499 with 2002 and again goes back to
the first instruction, and so forth. Only when, after adding unity
for the last time, the number 2500 appears in location (a + 1) and
the reciprocal of 2500 has been printed does the fourth instruction
let the control unit pass on to the next instruction, for then we have
(a + 3)' = 2499, (a + 1)' = 2500. The machine is then stopped:

(5)  stop 

All  the  results  have  been  printed  out. 

Thus,  we  take  a = 5 and  the  complete  program  looks  like  this: 

(1)  + 6 7 6, 

(2)  : 7 6 9, 

(3)  print  9,

(4)  jump  8 6 1, 

(5)  stop, 

(6)  2000, 

(7)  1, 

(8)  2499 

We needed only one storage location (No. 9) for storing intermediate
results, but in the process of operation the contents of the
sixth location varied from 2000 to 2500 at intervals of 1.
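
The behaviour of this loop can be mirrored line for line in a high-level language. The following sketch is an illustration only; the variable names are assumptions, and a list called `table` stands in for the printer.

```python
counter, one, limit = 2000, 1, 2499   # contents of locations 6, 7 and 8
table = []                            # stands in for the printed output

while True:
    counter = counter + one           # (1)  +  6 7 6
    reciprocal = one / counter        # (2)  :  7 6 9
    table.append(reciprocal)          # (3)  print 9
    # (4)  jump 8 6 1: control returns to instruction (1) unless
    # (8)' < (6)', in which case the next instruction is executed.
    if limit < counter:
        break                         # (5)  stop

print(len(table), table[0], table[-1])
```

Storing 2499 rather than 2500 in location 8 is what makes the strict comparison let the reciprocal of 2500 be printed before the loop is left.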

Let us examine a variant in which at the end of the operation
period the reciprocals are not printed out but are located in storage
elements of the memory unit with number labels from 10 to 509
(these quantities may be needed for subsequent computations):

(1)  + 6 7 6, 

(2)  +'  3 9 3, 

(3)  : 7 6 9, 

(4)  jump  8 6 1, 

(5)  stop, 

(6)  2000, 

(7)  1,

(8)  2499,

(9)  (1 in the last digit position of the storage
element, the other positions being 0)

Here, an auxiliary number (without a quantitative value) is
delivered to the ninth location of the program; it serves to convert
instructions, which is a fundamentally new type of operation. After
the first execution of the second instruction, the third instruction,
accepted by the arithmetic unit as a number, takes the form: 7 6 10.
(Note that addition via instructions (1) and (2) is performed by different
rules; this was indicated by a prime. Think this over!) For this
reason, after the third instruction the number 1 : 2001 is sent to the
10th location. After a second execution of the second instruction the
third instruction becomes 7 6 11, and so after the second execution
of the third instruction the number 1 : 2002 is sent to the 11th location,
and so on. This operation of changing the address in an instruction
is called address substitution.

Thus,  instructions  can  be  automatically  converted  in  the  process 
of  operation  of  the  machine.  This  opens  up  fresh  opportunities  in  the 
use  of  electronic  digital  computers. 
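
Address substitution can be imitated by keeping the division instruction in a mutable list and letting the program itself alter its last address. This is a sketch under simplifying assumptions: the names `divide` and `unit_in_last_address` are illustrative, and the primed addition is modelled simply as adding 1 to the destination address.

```python
memory = {6: 2000, 7: 1, 8: 2499}     # data locations 6, 7 and 8
divide = [":", 7, 6, 9]               # instruction (3), stored as modifiable data
unit_in_last_address = 1              # location 9: a 1 in the last digit position

while True:
    memory[6] = memory[6] + memory[7]              # (1)  +  6 7 6
    divide[3] += unit_in_last_address              # (2)  +' 3 9 3 alters instruction (3)
    _, first, second, dest = divide
    memory[dest] = memory[first] / memory[second]  # (3)  :  7 6 dest
    if memory[8] < memory[6]:                      # (4)  jump 8 6 1 falls through
        break                                      # (5)  stop

print(divide, memory[10], memory[509])
```

After the run the instruction reads `[":", 7, 6, 509]`, and locations 10 to 509 hold the reciprocals of 2001 to 2500.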

The conditional transfer of control instruction is also used without
loops, when it is required to organize program splitting, which is the
performing of distinct sequences of operations depending on circumstances
that are not known beforehand. Suppose that in a computation
a number a is to appear in a location labelled N and we are to
leave it unchanged if a ≥ 0 or square it and leave the result in
location N if a < 0. Since ordinarily the number 0 is placed in the
location labelled 0, the required operation can be performed by appropriately
placing the following instructions:


(k)  jump  N  0  k + 2,

(k + 1)  ×  N  N  N

The program splitting is handled automatically so that we do not
even  know  which  variant  has  been  accomplished  if  such  information 
has  not  been  provided  for. 

Finally,  let  us  examine  a program  in  which  the  total  number 
of  operations  has  not  been  provided  for  beforehand.  Suppose  we  have 
to  solve  the  cubic  equation 

$$x = 0.1x^3 + 1$$

by the iterative method (Sec. 1.3) accurate to 0.001, beginning with
the value $x_0 = 0$. To do this, place the number 0 in (a + 1) (the
appropriate row of the card is simply not punched), the number 0.1
in the location (a + 2), the number 1 in (a + 3) and the number
0.001 in (a + 4). We will place the successive approximations in
storage location (a + 1). Computing the next approximation from the
previous one in this manner is accomplished via the instructions:

(1)  ×  a + 1  a + 1  a + 5,

(2)  ×  a + 5  a + 1  a + 5,

(3)  ×  a + 2  a + 5  a + 5,

(4)  +  a + 5  a + 3  a + 5

Thus, the next approximation is placed in location (a + 5).
It must be compared with the preceding one located at (a + 1): if
they do not coincide to within 0.001, then we place the next approximation
in location (a + 1) and repeat the iteration; if the approximations
differ by less than 0.001, the result is printed. This can be done via
the instructions (which should be checked)

(5)  |-|  a + 1 a + 5 a + 1 

(this instruction sends the number |(a + 1)' - (a + 5)'| to location
(a + 1)),

(6)  jump  0 a + 1 9, 

(7)  + a + 5 0 a+1, 

(8)  jump  1 1 1, 

(9)  print  a + 5 

(10)  stop 

Thus  we  can  set  a = 10  and  write  out  the  entire  program  in 
14  storage  locations.  By  this  program,  the  computer  will  perform 
iterations  until  the  last  two  approximations  coincide  to  within  the 
accuracy indicated; it will then print the result and stop.

Note that if the iteration process does not converge, the computer
will overflow (that is, it will exceed the capacity of the number
representation) or it will lock into a loop, in which case the machine is not
able to stop by itself and requires intervention from the control panel.
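
The logic of this program, including a guard against the endless loop just mentioned, might be sketched in a high-level language as follows. The function name `iterate` and the cap `max_iterations` are assumptions introduced for the illustration; the machine program above has no such cap and relies on the operator instead.

```python
def iterate(f, x0, tolerance, max_iterations=1000):
    """Repeat x -> f(x) until two successive approximations agree to within tolerance."""
    x = x0
    for _ in range(max_iterations):
        x_next = f(x)
        if abs(x_next - x) < tolerance:
            return x_next
        x = x_next
    # Plays the role of the operator stopping the machine from the console.
    raise RuntimeError("no convergence within the allotted number of iterations")


root = iterate(lambda x: 0.1 * x**3 + 1, x0=0.0, tolerance=0.001)
print(root)
```

Note that the stopping rule only guarantees that the last two approximations are close to each other; how close the result is to the true root depends on how fast the iterations converge.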

Programs for more complicated problems may be quite extensive,
but they frequently include simpler (and often repeated) problems
such as computing the sine of the relevant quantities, and
the like.

These simpler problems are solved by means of standard subprograms,
which are merely a set of instructions compiled beforehand
and  residing  in  specific  storage  locations  of  the  internal  memory 
of  the  computer  or  in  the  external  memory  section.  In  the  writing 
of  a program  of  considerable  complexity,  such  standard  subprograms 
are  brought  into  the  computation  on  a single  instruction. 

In  recent  years,  the  work  of  programming  a computer  has  been 
simplified  greatly  by  the  development  of  several  universal  machine 
languages, which make it possible to write the program in more conventional
mathematical terms than what we have just described.
A program written in such a language does not depend on the type of
computer and is printed on a device very much like a typewriter.
Then  a special  translator  (a  system  having  a number  of  inputs  and 
outputs)  automatically  converts  the  given  program  to  that  of  the 
machine  to  be  used  in  the  computation. 

Two  languages  of  this  kind  are  in  wide  use:  ALGOL  (from 
ALGOrithmic  Language)  and  FORTRAN  (meaning  FORmula 
TRANslator).  They  are  of  considerable  aid  in  making  electronic 
digital  computers  more  accessible  to  the  scientific  community. 

Exercises 

1. Write a program for computing a table of the squares of the
natural numbers from 1 to 1000 inclusive, with the values of $n$
and $n^2$ alternating in the sequence: 1, 1, 2, 4, 3, 9, ... .

2. Write a program for computing the sum $\sqrt{5} + \sqrt{6} + \dots + \sqrt{100}$
(this sum was discussed in Sec. 1.2). Let the extraction of the
square root be a separate operation indicated by the code √.

3. Program the computation of the integral $\int_0^1 \frac{dx}{1+x}$ by Simpson's
formula (Sec. 1.1) with the interval of integration partitioned
into 100 parts; into 1000 parts.

4. Write a program for integrating the differential equation
$y' = x^2 - y^2$ over the interval $-1 \le x \le 0$ with the initial
condition $y(-1) = 0$ by the recalculation method described in
Sec. 8.7 (cf. Table 5) with the integration step 0.001, and print the
results at intervals of 0.1. Use the operation { } to indicate the
fractional part of a number ({5.3} = 0.3, {-5.3} = 0.7, {5} = 0).

15.5  Use  computers! 

From  Sec.  15.4  it  is  clear  that  programming  a digital  computer 
is  not  difficult  in  simple  cases.  Programming  is  even  being  taught  in 
secondary  school.  In  many  cases  computers  have  made  it  possible 
to  increase  accuracy  and  speed  of  computations  by  several  orders 
of magnitude; certain complicated problems have for the first time
entered the realm of solvability. (To get an idea of the range of these
operations, recall that small-scale digital computers perform several
thousand arithmetic operations per second, medium-sized computers,
tens of thousands per second, and large ones, like the Soviet BESM-6,
do a million operations per second.)

The  only  barrier  left  to  the  widespread  use  of  computers  would 
seem  to  be  psychological.  To  put  it  crudely,  many  research  workers  are 
simply  frightened:  they  fear  computers  somewhat  along  the  same 
lines  that  past  generations  of  mathematicians  were  overawed  by 
integrals that could not be evaluated, transcendental finite equations,
differential equations not solvable by quadratures, and the
like. Instead of fundamentally altering the approach to such integrals
and equations, scientists persistently sought to find new cases
of  integrability.  All  this  appreciably  restricted  the  possibilities  of 
mathematical  applications.  Today  the  situation  is  similar  with  regard 
to  electronic  digital  computers. 

One  should  not  of  course  veer  to  the  other  extreme  and  think 
that  computers  make  superfluous  all  analytic  (exact,  approximate, 
asymptotic)  formulas  and  methods,  “hand”  computation  aided  by 
the  electric  desk  calculator  and  the  slide  rule,  and  pencil-and-paper 
calculations.  Analytic  solutions,  when  they  are  possible,  often  have 
the  inestimable  advantage  of  being  compact,  particularly  if  the 
problem  includes  parameters  or  the  solution  is  obtained  as  a function 
of  several  independent  variables.  Asymptotic  formulas  are  effective 
in  cases  where  the  use  of  numerical  methods  becomes  involved.  Hand 
calculation  is  the  most  mobile,  so  to  say,  and  is  particularly  well 
adapted  to  rough  estimates,  which  should  be  resorted  to  as  often 
as  possible,  even  when  preparing  for  very  extensive  computations. 
Computers  are  not  meant  to  replace  other  fruitful  mathematical 
methods,  but  to  combine  with  them  and  thus  substantially  expand 
the  range  of  application  of  mathematics. 

When a computer is used to solve a problem that has already
been stated in mathematical terms, the point of greatest responsibility
is usually that of preparing the problem for the programming
process. It is often quite a job to choose a method capable of yielding
a reliable result and yet one that is within the capacity of the
computer, which is powerful but not all-powerful. In this preparatory
stage, one frequently has to recast one's “mathematical thinking”
that was reared on hand methods.

For  example,  in  the  premachine  era  it  was  taken  for  granted 
that  if  it  is  possible  in  a nonlinear  differential  equation  to  lower 
the  order  of  the  equation  by  means  of  a substitution,  then  this  should 
be done. Say, to solve the equation

$$y'' = f(y, y') \qquad (y = y(x)) \tag{7}$$

the advised procedure was: consider the relationship $y' = p$ as a
function of $y$, whence $y'' = \frac{dp}{dx} = \frac{dp}{dy}\frac{dy}{dx} = p\frac{dp}{dy}$, and equation (7)
takes the form

$$p\,\frac{dp}{dy} = f(y, p) \qquad (p = p(y)) \tag{8}$$

which is an equation of the first order. If we are able to integrate
it, that is, to find the general solution $p = \varphi(y, C_1)$, then we write

$$p = \frac{dy}{dx} = \varphi(y, C_1)$$

whence

$$\frac{dy}{\varphi(y, C_1)} = dx \quad\text{and}\quad \int \frac{dy}{\varphi(y, C_1)} = x + C_2 \tag{9}$$



This procedure sometimes achieves its aim, particularly in
an analytic investigation of the solution, but for a numerical solution,
especially using a computer, it ordinarily proves inadvisable,
for we have to solve (8) numerically (and if it is solved in quadratures,
then we have to compute the appropriate integrals), and
then compute the integral (9), and, finally, invert the resulting
relationship x(y). But it is much simpler (both for compiling tables
of the values of a particular solution and for complete tables of
the general solution) to integrate equation (7) directly numerically,
without lowering the order of the equation. That is exactly
what should be done on a computer.
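
A minimal sketch of such a direct integration is given below. It treats equation (7) as the pair of first-order equations $y' = p$, $p' = f(y, p)$ and advances both at once. The classical fourth-order Runge-Kutta scheme is used here purely as a familiar stand-in, not as the method of Ch. 8; the function name and the test equation $y'' = -y$ are likewise assumptions chosen because the exact solution $y = \sin x$ is known.

```python
import math


def integrate_second_order(f, x0, y0, p0, x_end, steps):
    """Integrate y'' = f(y, y') from x0 to x_end; return (y, y') at x_end.

    The equation is advanced as the system y' = p, p' = f(y, p)
    with the classical fourth-order Runge-Kutta scheme.
    """
    h = (x_end - x0) / steps
    y, p = y0, p0
    for _ in range(steps):
        k1y, k1p = p, f(y, p)
        k2y, k2p = p + 0.5 * h * k1p, f(y + 0.5 * h * k1y, p + 0.5 * h * k1p)
        k3y, k3p = p + 0.5 * h * k2p, f(y + 0.5 * h * k2y, p + 0.5 * h * k2p)
        k4y, k4p = p + h * k3p, f(y + h * k3y, p + h * k3p)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return y, p


# Test case: y'' = -y, y(0) = 0, y'(0) = 1, whose solution is y = sin x.
y_end, p_end = integrate_second_order(lambda y, p: -y, 0.0, 0.0, 1.0, 1.0, 100)
print(y_end, math.sin(1.0))
```

No quadrature and no inversion of x(y) is needed: the table of y against x comes out of the loop directly.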

This means that one should be able, at least roughly, to estimate
the volume of computation required to bring
the solution of the problem to completion.

Here  is  another  example.  The  very  statement  of  the  problem 
in  Secs.  1.2  and  3.4  on  approximating  sums  by  means  of  integrals 
was  of  course  due  to  the  requirements  of  hand  computation.  In 
hand  calculation,  the  indicated  approximate  formulas  and  methods 
for  their  refinement  are  extremely  useful.  But  when  computers 
are  used,  direct  summing  of  the  terms  turns  out  to  be  more  effective 
in  most  cases. 

But  here  too  one  should  not  act  unthinkingly.  Suppose  we 
want  to  find  the  sum  of  the  infinite  series 


$$S = \frac{1}{1^2} + \frac{1}{2^2} + \dots + \frac{1}{n^2} + \dots$$

Regarding the computer as all-powerful, we might attempt to calculate
and add terms of the series until we get a machine zero, that
is, until they become zero after the rounding that is necessary to
represent the number in the storage location, after which the partial
sums of the series will cease to increase. But for the parameters
of the computer given in Sec. 15.3 (see (4), (5)), this will occur when

$$\frac{1}{n^2} < 2^{-55}, \quad\text{or}\quad n > 2^{27.5} \approx 2 \cdot 10^8$$

As may be seen from Sec. 3.4, the error is then close to $1/n$, or eight
orders of magnitude higher than the last terms. Since adding a term
of a series to a partial sum requires several operations, for a computer
operating at 20 000 operations per second it will take 5 hours of
time, 300 roubles of government money and will incur the fiercest
criticism of one's colleagues, especially when they find out what
the computing was all about. (If we now suppose the length of a storage
location to be 40 instead of 30 digits, which is closer to actual numbers,
the operation period and cost of the computation will rise by a factor
of 32.)

Of  course  computations  should  not  be  handled  that  way.  The 
result  can  be  obtained  much  faster  and  more  accurately  if,  say, 
we  sum  the  first  1000  terms  of  the  series  (which  takes  less  than 
a second)  and  then  replace  the  remainder  by  an  integral  via  the 
method of Sec. 3.4. The precision of the result can be found by repeating
the computation for 2000 terms. A still better way is to take
a reference book (say [6], where you will find that $S = \pi^2/6$), since
the value of $\pi^2$ can be found very precisely in tables. (True, it was
simply our luck in this case because numerical series rarely develop
into finite expressions in terms of known constants.)
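
The sensible procedure takes only a few lines. The sketch below assumes that replacing the remainder by an integral means integrating $1/x^2$ from half a step beyond the last summed term to infinity, which gives $1/(N + \tfrac{1}{2})$; this is one common form of such a correction and may differ in detail from the formula of Sec. 3.4. The function name is illustrative.

```python
import math


def series_sum_estimate(n_terms):
    """Sum the first n_terms of 1/n^2 and replace the remainder by an integral."""
    partial = sum(1.0 / (n * n) for n in range(1, n_terms + 1))
    # Integral of dx / x^2 from n_terms + 1/2 to infinity.
    tail = 1.0 / (n_terms + 0.5)
    return partial + tail


s_1000 = series_sum_estimate(1000)
s_2000 = series_sum_estimate(2000)
print(s_1000, s_2000, math.pi**2 / 6)
```

Comparing the two estimates gives an idea of the precision without any reference to the exact value, which is the check the text recommends.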

Computations done on electronic digital computers are fundamentally
discrete. For this reason, when one has a digital computer
in mind, one must formulate the problem as a differential or integral
equation while bearing in mind all the time how the problem and the
method of its solution will appear in discrete terms. For equations
involving  more  than  one  independent  variable  this  gives  rise  to 
a diversity  of  complications,  many  of  which  have  not  yet  been 
overcome. 

Problems  involving  parameters  require  special  attention.  For 
example,  suppose  we  want  to  compile  a table  that  could  be  used 
to  solve  the  complete  cubic  equation 

$$ax^3 + bx^2 + cx + d = 0 \tag{10}$$

If we assume that each of the parameters a, b, c, d can take on 50
values, which isn't many, then we get a total of $50^4 \approx 6 \cdot 10^6$ combinations
of these values. An average-size computer can handle
the job in about one month of continuous operation with most of
the time devoted to printing. This will leave you with 200 kilometres
of tape weighing two tons, and you will rue the day you got
the idea of such a problem.

Generally,  most  people  newly  introduced  to  computers  and 
amazed  at  what  they  can  do  try  to  obtain  as  many  numerical  results 
as possible; they are guided by the naive principle that “the more
figures, the more information and, hence, the greater utility”. But
these people are then flooded by a tide of numbers, and the fresh
task of extricating something of value often proves more complicated
than the original problem. All of which is reminiscent of the
medieval legend of the magician and his apprentice, who in the
absence of his mentor called forth a jinn and caused him to haul
water but was unable to stop him and almost drowned. Hence the
importance and timeliness of the frequently repeated statement by the
prominent computer expert R. Hamming [8]: “... it is a good general
rule to begin a problem in computation with a searching examination of
‘What are we going to do with the answers?’”

Let  us  return  to  equation  (10).  Actually  the  situation  is  not 
so  hopeless  after  all.  First  divide  both  sides  by  a: 

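$$x^3 + \frac{b}{a}x^2 + \frac{c}{a}x + \frac{d}{a} = 0$$

Then make the substitution $x = y + \alpha$, choosing $\alpha$ so as to eliminate
the term with the unknown squared. This gives us $\alpha = -\frac{b}{3a}$ and
we get the equation (check this)

$$y^3 + py + q = 0, \quad\text{where}\quad p = \frac{c}{a} - \frac{b^2}{3a^2}, \quad q = \frac{d}{a} - \frac{bc}{3a^2} + \frac{2b^3}{27a^3}$$

Finally, make the substitution $y = \beta z$ to get

$$\beta^3 z^3 + p\beta z + q = 0, \quad\text{or}\quad z^3 + \frac{p}{\beta^2}z + \frac{q}{\beta^3} = 0$$

Choosing $\beta = \sqrt[3]{q}$, we arrive at the equation

$$z^3 + rz + 1 = 0, \quad\text{where}\quad r = pq^{-2/3} \tag{11}$$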

The  idea  behind  these  manipulations  is  clear:  we  successively 
reduced  the  number  of  parameters  till  we  reached  (11)  with  only 
one,  r,  which  is  a combination  of  the  initial  parameters.  To  compute  r 
from  these  parameters  requires  simple  arithmetic  operations  and 
a table  of  cube  roots.  Using  a computer,  it  is  now  easy  to  set  up 
a table  of  values  of  the  solutions  of  equation  (11)  depending  on  the 
single  parameter  r,  even  if  we  assign  5000  values  to  r instead  of  50. 
Using  the  solution  z,  we  easily  find  the  solution  x: 

$$x = \beta z + \alpha = \frac{(27a^2 d - 9abc + 2b^3)^{1/3}\, z - b}{3a}$$

The solution, as we see, comes out in the form of two tables
of one entry each (the table of cube roots and $z(r)$), which of course
is incomparably simpler than one table of four entries.
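
The chain of substitutions is easy to check numerically. In the sketch below a bisection search plays the part of the one-entry table $z(r)$; the function names, the choice of bisection and the sample coefficients are assumptions made for the illustration, and the case $q = 0$, where the last substitution breaks down, is deliberately left out.

```python
import math


def reduce_cubic(a, b, c, d):
    """Return (r, beta, alpha) such that x = beta*z + alpha and z^3 + r*z + 1 = 0."""
    alpha = -b / (3 * a)
    p = c / a - b * b / (3 * a * a)
    q = d / a - b * c / (3 * a * a) + 2 * b**3 / (27 * a**3)
    beta = math.copysign(abs(q) ** (1 / 3), q)   # real cube root; assumes q != 0
    r = p / beta**2
    return r, beta, alpha


def real_root_of_reduced(r):
    """Find a real root of z^3 + r*z + 1 = 0 by bisection (stand-in for the table z(r))."""
    def g(z):
        return z**3 + r * z + 1

    low, high = -(2 + abs(r)), 0.0               # g(low) < 0 < g(high) = 1
    for _ in range(200):
        mid = 0.5 * (low + high)
        if g(mid) < 0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


# Sample cubic 2(x - 1)(x + 3)(x - 4) = 2x^3 - 4x^2 - 22x + 24.
a, b, c, d = 2.0, -4.0, -22.0, 24.0
r, beta, alpha = reduce_cubic(a, b, c, d)
z = real_root_of_reduced(r)
x = beta * z + alpha
print(r, z, x)
```

For this sample the search returns the negative root of the reduced equation, which maps back to the root x = -3 of the original cubic.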

The problem of reducing the number of parameters can also
arise in the solution of a differential equation containing parameters.
A case of this kind was considered in Sec. 8.10, where a problem
containing five parameters (if one takes into account that the solution
itself is a function of t, we find that a table with six entries (!) is
required to represent the solution) was reduced to the problem (71)
of Ch. 8 with two parameters ((72) of Ch. 8) with the aid of similarity
transformations.

By way of another illustration, consider the solution of equation
(7) without parameters for arbitrary initial conditions:

$$y(x_0) = y_0, \quad y'(x_0) = y'_0$$

It would appear at first glance that the solution requires a table
of four entries for its representation, since the parameters $x_0$, $y_0$, $y'_0$
and the independent variable $x$ may assume arbitrary values.
But the number of entries can easily be reduced to three. Indeed,
equation (7) does not contain $x$, which means it is invariant under
the transformation $x \to x + \text{constant}$. For this reason, along with
the solution $y = g(x)$, (7) has the solution $y = g(x + C)$ for any $C$.
But this means that the solution $y$ does not depend on $x$ and
$x_0$ separately but on the combination $x - x_0$:

$$y = \psi(x - x_0, y_0, y'_0) \tag{12}$$

And so it suffices to compile a table of three entries $y = \psi(x, y_0, y'_0)$
to solve equation (7) for $x_0 = 0$; then for any $x_0$ we can find the solution
from (12).

In this last example, the number of entries in the complete
table of the solution was reduced by indicating the group of transformations
$x \to x + C$ under which the equation at hand remains
invariant (transformation groups are discussed in Sec. 14.4). This
also means that the set of graphs of all solutions is invariant under
translations along the x-axis: under a translation, each such graph
passes into the graph of some other solution. The interesting thing
here is that we arrive at this conclusion by analyzing the equation
itself and not its solutions!

It turns out that a knowledge of the continuous group of transformations
(i.e., transformations dependent on one or several continuous
parameters) that leave the given differential equation invariant
always permits reducing the number of entries in the complete table
of its solution. Unfortunately, it is far from always possible to
detect such groups, but sometimes they are suggested by physical
reasoning  (as  the  time-shift  group  for  an  autonomous  system). 

Do not grudge the time taken to reduce the number of parameters
in a problem! First of all, it cuts the overall amount of computation
and makes the results more surveyable: for example,
if a parameter takes on 20 values, getting rid of it will cut computations
and the extent of the final table by a factor of 20. Secondly,
the  combinations  of  parameters  obtained  after  transforming  the 
problem  are  often  profoundly  meaningful  from  a physical  point 
of  view. 

At times, the use of computers suggests the advisability of
fundamentally changing the conventional procedure for problem
solving. For example, in computing the integral

$$I = \int_{(\Omega)} f(M) \, d\Omega$$

over a region $(\Omega)$ of high dimensionality and of complex shape it is
better to replace quadrature formulas of the Simpson type (Sec.
1.1) by a method based on treating the integral as the arithmetic
mean of the function values. Say, in the region $(\Omega)$ we choose
at random points $M_1, M_2, \dots, M_N$ (computers have randomizing
devices that generate random numbers which are taken for the
coordinates of these points) and then put

$$I \approx \frac{\Omega}{N} [f(M_1) + f(M_2) + \dots + f(M_N)]$$

(here $\Omega$ in the numerator is the measure of the region).
For large N (and the computer can see to it that N is large enough)
the accuracy of the formula is quite acceptable. This procedure
is one of the simplest instances of a general, specifically machine,
method called the Monte Carlo method (named after the city
of gambling fame, the business of which is completely based on
independent random trials) in which the desired quantity is
represented as the mean value of a certain random variable (Sec. 13.7),
and this mean value is replaced by the arithmetic mean of
realizations of this variable in large numbers of independent trials.
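A minimal sketch of this procedure is given below. It rests on assumptions made only for illustration: the region is taken to be the five-dimensional box $0 \le x_i \le 2$ (so that random points inside it are easy to generate and its measure is $2^5 = 32$), the integrand is $f(M) = x_1 + \dots + x_5$, whose exact integral over the box is 160, and the helper name `monte_carlo_integral` is our own. The estimate is the measure of the region times the arithmetic mean of the function values at N random points.

```python
import random

def monte_carlo_integral(f, lower, upper, n, seed=0):
    """Estimate the integral of f over a box as (measure) * (mean of f at random points)."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    measure = 1.0
    for lo, hi in zip(lower, upper):
        measure *= hi - lo
    total = 0.0
    for _ in range(n):
        point = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        total += f(point)
    return measure * total / n

def f(point):
    # Illustrative integrand: the sum of the coordinates.
    return sum(point)

estimate = monte_carlo_integral(f, [0.0] * 5, [2.0] * 5, 200_000)
exact = 160.0
print(estimate, abs(estimate - exact))
```

The error falls off only like $1/\sqrt{N}$, but this rate does not depend on the dimensionality of the region, which is what makes the method attractive where quadrature formulas of the Simpson type would need far too many nodes.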

A fundamentally important problem when working with computers
is the effect of round-off errors. When we have long chains
of computations and succeeding steps rely all the time on preceding
results, rounding errors can build up to a point where we will be
dealing solely with errors.

Here is a striking example of this effect (it is given in [1]). In
evaluating the integral

$$I_n = \frac{1}{e} \int_0^1 x^n e^x \, dx \quad (n = 0, 1, 2, \dots) \qquad (13)$$

it is easy to establish by integration by parts that

$$I_n = 1 - n I_{n-1} \quad (n = 1, 2, 3, \dots) \qquad (14)$$

Besides,

$$I_0 = \frac{1}{e} \int_0^1 e^x \, dx = \frac{1}{e}(e - 1) = 1 - \frac{1}{e} = 0.632 \qquad (15)$$

Formulas (14) and (15) permit computing successively $I_1 = 1 - 1 \cdot I_0$,
$I_2 = 1 - 2 I_1$, and so on. The results $\tilde{I}_n$ obtained in
computing these values to three decimal places are:

n               0      1      2      3      4      5      6      7      8
$\tilde{I}_n$   0.632  0.368  0.264  0.208  0.168  0.160  0.040  0.720  -4.760

The results are clearly absurd, since from (13) it is at once apparent
that $I_0 > I_1 > I_2 > \dots > 0$.

The root of the trouble is clear: in the computation of $\tilde{I}_n$ the
original rounding error of $I_0$ is multiplied by $1 \cdot 2 \cdot 3 \cdots n = n!$, and since the
exact value of $I_n$ is bounded (and even tends to zero as $n \to \infty$), the
relative error increases swiftly. Computations with large numbers
of digits, which is typical of computers, help matters somewhat
but not for long. Working to 9 significant digits, we reach nonsense
results from n = 14 on. The benefit of a repeated computation with
a different number of decimal digits is that we learn how reliable
our computations were.
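The build-up of the error can be reproduced in a few lines. The sketch below is an illustration only (the function names are our own): it applies the recurrence $I_n = 1 - n I_{n-1}$ in exact decimal arithmetic, starting from the rounded value 0.632, so that the only error present is the original rounding error of $I_0$. It then compares the result for n = 8 with the predicted error $(-1)^n \, n! \, \varepsilon$, where $\varepsilon = 0.632 - (1 - 1/e)$.

```python
from decimal import Decimal
import math

def forward_table(i0, n_max):
    """Apply I_n = 1 - n * I_(n-1) without further rounding, starting from i0."""
    values = [i0]
    for n in range(1, n_max + 1):
        values.append(1 - n * values[-1])
    return values

def predicted_error(n, eps):
    """Error in I_n caused by an initial error eps in I_0."""
    return (-1) ** n * math.factorial(n) * eps

rounded = forward_table(Decimal("0.632"), 8)
eps = 0.632 - (1.0 - math.exp(-1.0))          # rounding error of the starting value
corrected_8 = float(rounded[8]) - predicted_error(8, eps)

for n, value in enumerate(rounded):
    print(n, value)
print("I_8 after removing the predicted error:", corrected_8)
```

Subtracting the predicted error from the absurd value obtained for n = 8 brings us back to a number close to 0.101, which shows that the whole discrepancy comes from the fourth decimal place of $I_0$.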

Now it is not hard to rearrange our material so that the errors
fall off instead of building up. Replace n by n + 1 and rewrite (14) as

$$I_n = \frac{1 - I_{n+1}}{n + 1}$$

Then put a “starting” $I_n$ equal to zero and compute $I_n$ from large n
to small n. For example, put $I_{10} = 0$ and we get

n               9      8      7      6      5      4      3      2      1      0
$\tilde{I}_n$   0.100  0.100  0.112  0.127  0.146  0.171  0.207  0.264  0.368  0.632

More exact computations show that $I_9 = 0.092$, $I_8 = 0.101$, whereas
all the other figures are correct.
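The stable direction of the computation can be sketched in the same way. In the illustration below (the function name `backward_table` is our own) the recurrence $I_n = (1 - I_{n+1})/(n + 1)$ is run downwards from a starting value of zero, first from n = 10 as in the text and then from n = 20. Each step divides the inherited error by n + 1, so even a crude starting value does no harm by the time n = 0 is reached.

```python
import math

def backward_table(start_n):
    """Run I_n = (1 - I_(n+1)) / (n + 1) downwards from I_(start_n) = 0."""
    values = {start_n: 0.0}
    for n in range(start_n - 1, -1, -1):
        values[n] = (1.0 - values[n + 1]) / (n + 1)
    return values

exact_i0 = 1.0 - math.exp(-1.0)

from_10 = backward_table(10)
from_20 = backward_table(20)

print("I_0 started at n = 10:", from_10[0], "error", abs(from_10[0] - exact_i0))
print("I_0 started at n = 20:", from_20[0], "error", abs(from_20[0] - exact_i0))
print("I_9 and I_8 started at n = 20:", from_20[9], from_20[8])
```

Starting farther out improves the values with larger n as well, so the starting index can be chosen to suit the accuracy wanted.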

When computing integrals and solving differential equations
one often gets into a paradoxical situation due to the effects of
rounding errors: to improve accuracy we refine the partition, but if the
method used is unstable, then because of the increased number
of operations the rounding errors come to the fore and the overall
error increases. This must be kept in mind when working with
computers.

Yes, definitely get into the habit of using computers. Do the
computations yourself; don't entrust this important work to others. A
computing expert who is not sufficiently acquainted with the specific
nature of the problem (provided, of course, that he is not a co-author)
will do a lot of extra work and may find it difficult to reorganize himself,
to back out of a dead-end alley, or to resort to intuition based on the
physics of the situation or to rough estimates; all this will
unavoidably affect the result.

Finally, it is well worth remembering the basic proposition
of Hamming that is often repeated in his book [8]:

“The purpose of computing is insight, not numbers.”

All results must be physically meaningful, they must exhibit
trends, influences, and the crowning achievement of the job must
be the creation of an approximate interpolation theory with
coefficients obtained from precise calculations. This ideal may not always
be attainable, but let us at least strive to attain it!

Exercise 

Indicate the group of transformations that leave invariant
the equation $y''' = f(x, y'')$; also indicate the number of
entries needed to form a table of its general solution.

ANSWERS  AND  SOLUTIONS 


Sec.  15.3 

1. 10011100001111, -1/11 = -0.010101..., 10.11

Sec. 15.4

1. Here and henceforth only one of several possible variants is
indicated:

(1) + 9 7 9,
(2) print 9,
(3) × 9 9 10,
(4) print 10,
(5) jump 8 9 1,
(6) stop,
(7) 1,
(8) 999,
(9) 0


2. (1) + 9 7 9,
(2) r 9 11,
(3) + 10 11 10,
(4) jump 8 9 1,
(5) print 10,
(6) stop,
(7) 1,
(8) 99,
(9) 4,
(10) 0

3. (1) + 18 23 18,
(2) : 19 18 26,
(3) + 20 26 20,
(4) + 18 23 18,
(5) : 19 18 26,
(6) + 21 26 21,
(7) jump 24 18 1,
(8) - 21 26 21,
(9) × 21 16 21,
(10) × 20 17 20,
(11) + 22 21 22,
(12) + 22 20 22,
(13) : 22 25 22,
(14) print 22,
(15) stop,
(16) 2,
(17) 4,
(18) 1,
(19) 1,
(20) 0,
(21) 0,
(22) 1.5,
(23) 0.01,
(24) 1.99,
(25) 300

In the second variant, the last three initial parameters
have to be replaced, respectively, by 0.001, 1.999, and 3000;
the computer handles the rest! It is even easy to write a
program in which replacing the step would only require replacing
a single initial parameter.
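For comparison, here is how the same kind of computation might look in a present-day language. This is a sketch under an assumption: judging by its constants (the starting sum 1.5, the step 0.01, the limit 1.99 and the divisor 300 = 3/0.01), the machine program above appears to evaluate $\int_1^2 dx/x$ by Simpson's rule, and the code below is written on that reading. The function name `simpson_reciprocal` and its parameters are our own. The step is derived from a single parameter, the number of pairs of subintervals, which is the improvement suggested in the last sentence.

```python
import math

def simpson_reciprocal(n_pairs, a=1.0, b=2.0):
    """Simpson's rule for the integral of 1/x over [a, b] with 2 * n_pairs subintervals."""
    n = 2 * n_pairs
    h = (b - a) / n
    # Interior nodes with odd numbers get weight 4, those with even numbers weight 2.
    odd_sum = sum(1.0 / (a + (2 * k + 1) * h) for k in range(n_pairs))
    even_sum = sum(1.0 / (a + 2 * k * h) for k in range(1, n_pairs))
    return (h / 3.0) * (1.0 / a + 1.0 / b + 4.0 * odd_sum + 2.0 * even_sum)

coarse = simpson_reciprocal(50)    # step 0.01, as in the first variant
fine = simpson_reciprocal(500)     # step 0.001, as in the second variant
print(coarse, fine, math.log(2.0))
```

The nodes are computed from an integer counter rather than by repeated addition of the step, which avoids the accumulation of rounding errors in the argument.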

4. (1) print 22,
(2) print 23,
(3) × 23 23 29,
(4) - 24 29 29,
(5) × 29 25 30,
(6) + 23 30 30,
(7) + 22 25 22,
(8) × 22 22 24,
(9) × 30 30 30,
(10) - 24 30 30,
(11) + 29 30 29,
(12) × 29 26 29,
(13) + 23 29 23,
(14) × 22 27 29,
(15) { } 29 29,
(16) jump 0 29 18,
(17) jump 0 0 3,
(18) print 22,
(19) print 23,
(20) jump 28 22 3,
(21) stop,
(22) -1,
(23) 0,
(24) 1,
(25) 0.001,
(26) 0.0005,
(27) 10,
(28) -0.1

Sec.  15.5 

The two-parameter group $y \to y + C_1 x + C_2$. If $z = \varphi(x, x_0, z_0)$
is a solution of the equation $z' = f(x, z)$ with the initial
condition $z(x_0) = z_0$, then the solution of the given equation for the
initial conditions $y(x_0) = y_0$, $y'(x_0) = y'_0$, $y''(x_0) = y''_0$ is of the
form

$$y = y_0 + y'_0 (x - x_0) + \int_{x_0}^{x} ds \int_{x_0}^{s} \varphi(\sigma, x_0, y''_0) \, d\sigma = y_0 + y'_0 (x - x_0) + \psi(x, x_0, y''_0)$$

which  means  a table  of  three  entries  suffices  for 

Central differences 40
Central-force  field  396 
Centre  of  mass  395 
Centripetal  force  328 
Characteristic  equation  238,  274 
Circular  permutation  420 
Circulation  347 
of  a vector  417 
Closed  systems  269 
Code  625 
Coefficient 

Cartesian 316
correlation  557 
Commutative operation 390
Complex  current  173,  175 
Complex Hilbert space 609
Complex  linear  space  335 
Complex  number(s)  158,  172 
absolute  value  of  159 
amplitude  of  159 
argument  of  159 
basic properties of 158
conjugate  161 
exponential  form  of  166 
logarithm  of  169 
modulus  of  159 

modulus-argument  form  of  159 
phase  of  159 
roots  of  169 

trigonometric  form  of  159 
Component  of  a vector  315 
Compound  event  518 
Computer(s) 
analogue  620 
digital  620,  621 
electronic digital 619ff
general-purpose  digital  622 
special-purpose  622 
use  of  634 

Computing,  purpose  of  642 
Condition  (s) 
boundary  298 
Cauchy-Riemann 182
initial  227,  240,  297 
Conditional  extremum  476 

in  the  calculus  of  variations  479 
for  a finite  number  of  degrees  of 
freedom  476 

Conditional  extremum  problems  479 
Conditional  transfer  of  control  instruc- 
tion 630 

Conditionally convergent series 91
Conjugate  complex  numbers  161 
Conjugate  harmonic  functions  183 
Constant  (s) 

Boltzmann  58,  537 
Euler's 96

Planck’s  288,  537,  599 
variation  of  233 

Continuity  equation  374,  376,  441 
Continuous  function  574 
Continuous  nowhere  differentiable  func- 
tions 611 

Continuously  distributed  quantities  544 
Continuous-mode  machines  620 
Control  unit  622 
Convective  rate  112 
Convergence 
regular  100 
uniform  100 


Convergent  integral  66,  70 
Convergent  series  88,  91 
absolutely 91
conditionally 91
Coordinates, generalized 151
Correlation  556 
Correlation  coefficient  557 
Correlational  dependence  556 
Coulomb’s  law  356,  400 
Criteria,  similarity  307 
Criterion (see criteria)

Cross  product  389 
Curl  432 

of a vector field 416, 421
of  a velocity  field  430 
Current,  complex  173,  175 
Curvature  of  a curve  326 
Curve  (s) 

envelope  of  a family  of  129,  130 

integral  226 
probability  526 
Curve  fitting  49 

graphical  method  of  55 
Cuts,  branch  190 
Cyclic  permutation  420 
Cycloid  467 

D’Alembert's  ratio  test  89 
Decimal  fractions  623 
Decimal  number  system  623 
Decode  625 

Degrees  of  freedom  150,  151 
Del  423,  424 
Delta  function  203,  205 
De  Moivre  formula  167 
Denary  number  system  623 
Density 

distribution  (of  primes)  561 
spectral  574,  583 
Dependence,  correlational  556 
Derivative of a function of a complex
variable 180
Derivative  (s) 
mixed  113 
partial  108,  109,  113 
of  a vector  321,  323 
Determinants  270 
Diagram,  phase  175 
Dido's  problem  480,  482,  484 
Difference  (s) 
central  40 
first-order  40 
second  divided  454 
second-order  40 
tables  and  39 
Differential
of mass 141
total  109 

Differential  equation(s)  225ff,  263ff 

first-order  (geometric  meaning  of) 
225 

numerical  solution  of  288 
ordinary  225 
partial  225 
systems  of  265 
Differentiation  operator  211 
Digital  computers  620,  621 

representation of numbers and
instructions in 623
Dimensionality  of  space  334 
Dipole  367 
Dipole  moment  367 
Dirac,  P.A.M.  205 
Dirac's  delta  function  203 
Direct  methods  (calculus  of  variations) 
503 

Dirichlet function 610
Discontinuities  of  the  first  kind  594 
Discrete-mode  machines  620 
Discrete  spectrum  574 
Dispersion  relations  585,  589 
Distributed  mass  207 
Distribution 
Poisson  540 

alternative  derivation  of  542 
of primes 561

Distribution density of primes 561
Distribution  function  544 
Divergence 

of  an  electric  field  376 
of  a vector  field  369,  370 
of  a velocity  field  374 
Divergence theorem 371
Divergent  integral  66,  69 
Divergent  series  88 
Dot  product  319 
Double  integrals  143,  144 
Dummy  index  20 

Economics,  problems  in  428,  450 
Effective  potential  400 
Einstein,  A.  332,  645 
Elastic  stress  tensor  412,  413 
Electromagnetic  field  438 
Electron  tube  126 

Electronic digital computers 619ff

Electrostatic  field  356 

Element 

of  mass  141 
storage 621
Elliptic  point  137 
Emde,  F.  14,  645 


Energy 

kinetic  325 

potential  345,  347,  454 
Enthalpy  44 
Entropy  533,  537 

(law  of  nondecreasing  entropy)  538 
of  a pack  of  cards  535 
(principle  of  entropy  increase)  538 
Envelope  of  a family  of  curves  129,  130 
Equation  (s) 

characteristic  238,  274 
continuity  374,  376,  441 
differential  (see  differential  equations) 
225ff,  263ff 
systems  of  265 
Euler’s  462,  465 
alternative derivation of 417
finite  28 

first-order (integrable types) 229
homogeneous  linear  230,  237 
Laplace  182,  183,  377 
linear  234,  236 
Maxwell’s 438, 441
nonhomogeneous linear 231
Poisson  376,  377 
Riccati  293 

second-order  homogeneous  linear 
(with  constant  coefficients)  236 
second-order  nonhomogeneous  linear 
242,  249 

with  variables  separable  229 
Eratosthenes,  sieve  of  562 
Euclidean  basis  315 
Euclidean  space  334 
Euler, L. 301
Euler's  constant  96 
Euler's  equation  462,  465 

alternative  derivation  of  417 
Euler’s  formula  164,  166 
Euler's  method  289 
Event(s)

compound 518
independence  of  518 
space  of  151 
Expected  cases  529 

Experimental results, mathematical
treatment of 39ff

Exponential  form  of  complex  number 
166 

External  memory  unit  625 
Extrapolation 
to  infinity  59 
linear  42 

Extrapolation  problem  39,  58 
Extrema  (see  extremum) 

Extremals  490 
Extremum
absolute  476 
conditional  476 

conditional (in the calculus of
variations) 479

conditional  (for  a finite  number  of 
degrees  of  freedom)  476 
necessary  condition  of  460 
Extremum  points  133 
Extremum  problems  131 
Extremum  problems  with  restrictions  488 

Family  of  curves,  envelope  of  129,  130 
Faraday,  M.  440 

Fermat’s  principle  in  optics  491,  492 
Fermi,  Enrico  19 
Field  (s) 

central-force  396 
electromagnetic  438 
electrostatic  356 

magnetic  (and  electric  current)  433 
of  mass  velocity  353 
nonstationary  340 
nonsteady-state  340 
plane  341 
plane-parallel 341
potential  426 
rotational 441
scalar  340,  341 
solenoidal  436 
stationary  112,  340 
steady-state  340 
vector  340 
velocity 351
Field  theory  340ff 
Finite  bicimal  623 
Finite  equations  28 
Finite  motion  398 
First  integral  268 
Floating-point  representation  624 
Flow  line  351,  375 
Fluid,  ideal  305 
Flux  351,  352 
of  a vector  369 
Focal  point  265 
Force 345
centripetal  328 
Formula(s) 

Bessel  43 
Euler's  164,  166 
de  Moivre  167 

Newton’s  (for  quadratic  interpola- 
tion) 42 

Ostrogradsky  371 
Poisson’s  539 
Simpson's  17 
Stirling's  82,  83 


FORTRAN  634 
Fourier  analysis  577 
Fourier  integrals  253 
Fourier  operator  589 
Fourier  series  602 
Fourier  transform  589 
Fourier  transformation  573ff,  589 
formulas  of  577ff 
inverse  590 
properties  of  589 
Fraction(s) 
bicimal  623 
decimal  623 

Frank-Kamenetskii, R.A. 645
Free  oscillations  180 
Frequencies  (see  frequency) 
higher  605 

Frequency  (see  frequencies) 
carrier  614 

Frequency modulation 614
Function(s)  211,  459 
analytic  181,  160 
complex-valued  173 
of  a complex  variable  158 
derivative  of  180 
integral  of  184 
concept  of  a 610 
conjugate  harmonic  183 
continuous  nowhere  differentiable 
611 

delta  203,  205 
Dirac's  delta  203ff,  205 
Dirichlet 610
distribution 544
generalized 610
Green’s  208,  209,  585 
harmonic  182,  378,  428 
implicit  118 

implicit  representation  of  123 
influence  209 
joint  distribution  558 
Lagrangian  500 

parametric  representation  of  122 
piecewise  analytic  611 
point  340 

probability  distribution  515,  516 
rapidly  oscillating  (integration  of) 
84 

rapidly  varying  (integration  of)  73 
related  to  the  delta  function  214 
of  several  variables  108ff 
signum  218 
transfer  586 

of two variables (geometrical
meaning of) 115
unit-step 214
Function space 606
Functional  456ff,  458,  459 
linear  459 
nonlinear  459 
quadratic  459 
Functional  relationship  556 
Fundamental  frequency  605 
Fundamental  harmonic  605 

Galerkin,  B.G.  504 

method  of  Ritz-Galerkin  504 
Gauss'  theorem  359 
Generalized  coordinates  151 
Generalized  functions  610 
Generalized  Ohm  law  442 
Generalized  vectors  334 
General-purpose  digital  computer  622 
Geometry,  projective  153 
Gibbs  phenomenon  604 
Gradient 341
Gradshteyn,  I.S.  645 
Green,  George  209 
Green's  function  208,  209,  585 
Group,  transformation  590 
Group  velocity  of  light  497 
Gyroscope  407 

Hadamard,  Jacques  S.  158 
Hagedorn,  Rolf  589,  645 
Hamilton,  W.R.  423 
Hamiltonian  operator  423 
Hamming,  R.W.  638,  645 
Harmonic  (s) 

fundamental  605 
higher  605 

Harmonic  analysis  577,  602 
Harmonic  functions  182,  378,  428 
Harmonic  oscillation  173 
Harmonic  series  90 
Higher  frequencies  605 
Higher  harmonics  605 
Hilbert  space  606ff,  607 
HM  9 

Homogeneous  linear  equations  230,  237 
Hyperbolic  point  137 
Hyperplane,  three-dimensional  334 

Ideal  fluid  305 
Ideal  line  153 
Ideal  points  153 
Image  589 
Imaginary  163 
pure  161 

Imaginary  part  of  a complex  number  158 
Imaginary  power  164 
Impedance  177 


Implicit functions 118

Implicit  representation  of  a function  123 

Improper  integrals  65 

Independence  of  events  518 

Index,  dummy  20 

Inertia  tensor  409 

Inertialessness  625 

Infinite  bicimal  623 

Infinite-dimensional  space  334 

Infinite  motion  398 

Influence  function  209 

Initial  condition  227,  240,  297 

Initial-value  problem  297,  299 

Inner product 319

Input  parameters  620 

Instructions  (computer)  625 

conditional  transfer  of  control  630 
three-address  625 

unconditional  transfer  of  control  630 
Integers,  positive  171 
Integral  (s) 

Bernoulli  382,  384 

over  a closed  contour  186 

computing  sums  by  means  of  20 

convergent  66,  70 

dependent  on  a parameter  99 

divergent  66,  69,  101 

double  143,  144 

first  (of  a system  of  equations)  268 
Fourier  253 

of  a function  of  a complex  variable  184 

improper  65 

line  143,  346,  417 

multiple  139,  143 

probability  529 

proper  65 

regularly  convergent  100 
Riemann  222 
Stieltjes  221,  222 
triple  143 
volume  140,  141 
Integral  curve  226 
Integral  test,  Cauchy’s  90 
Integral  theorem,  Cauchy's  186 
Integration 
numerical  14 

of  rapidly  oscillating  functions  84 
of  rapidly  varying  functions  73 
Integrator  620 
Intensity  of  electric  field  356 
Internal  memory  unit  625 
Interpolation  58 
linear  41 
quadratic  42 

Newton's  formula  for  42 
Interpolation  problem  16,  39,  58 
Interval,  tabulation  39 
Inverse Fourier transformation 590
Irrational  numbers  171 
Irreversible  processes  538 
Isobaric  surfaces  343 
Isocline  (s)  227 

of  infinities  227 
of  zeros  227 
Isolated  point  138 
Isoperimetric  problem  480 
Isothermic  surfaces  343 
Isotropy  335 

Iteration  methods  30,  278 

Jahnke,  E.  14,  645 

Joint  distribution  function  558 

Jump (computer) 630

Kamke,  E.  236,  645 
Kepler,  J.  397 
Keplerian  motions  402 
Kepler’s  first  law  402 
Kepler’s  second  law  397 
Kepler’s  third  law  401,  446 
Kinetic  energy  325 
Kirchhoff  laws  176 
Krainov,  V.  P.  82,  645 
Kronecker delta 409

Lagrange,  J.  L.  233 
Lagrange  method  478 
Lagrange  multiplier  478 
Lagrangian  function  500 
Landau,  L.D.  415 
Languages,  machine  634 
Laplace  equation  182,  183,  377 
Laplacian  operator  377,  426 
Laurent  series  194 
Law(s) 

Archimedean  381 
Biot-Savart 438

of  compound  probabilities  517 
of  conservation  of  parity  415 
Coulomb’s  356,  400 
generalized  Ohm  442 
Kepler’s  first  402 
Kepler’s  second  397 
Kepler’s  third  401,  446 
Kirchhoff  176 
Lenz 440

of  linearity  209,  211 

Newton’s  400 

Newton’s  second  325,  395 

Newton's  third  395 

of  nondecreasing  entropy  538 

normal  555 

Pascal  379 



of  radioactive  decay  57 
second  (of  thermodynamics)  538 
Torricelli's 71
Layer,  boundary  303,  304 
Least  action,  principle  of  497,  498, 
499,  500 

Least squares, method of 51, 560

Leibniz,  G.W.  456 

Leibniz  test  91 

Length  of  a vector  159 

Lenz  law  440 

Level  lines  116 

Level  surfaces  343 

L’Hospital,  G.F.A.  456 

Line(s) 

flow  351,  375 
of  force  350 
ideal  153 
at  infinity  153 
level  116 
oriented  346 
vector  359 
vortex  432 

Line  integrals  143,  346,  417 
Linear  equations  234,  236 
Linear  extrapolation  42 
Linear  functional  459 
Linear interpolation 41
Linear  mapping  330 
Linear  operations  334 
on  vectors  313 
Linear  operator  211 
Linear  space  334 
Linearity,  law  of  209,  211 
Linearly  dependent  vectors  314 
Linearly  independent  vectors  314 
Local  rate  112 
Localized  masses  207 
Location  (computers)  621,  623 
Logarithms  of  complex  numbers  169 
Loops  (computers)  630 
Losch,  F.  14,  645 
Lyapunov,  A.M.  256 

stable  in  the  sense  of  275 
unstable  in  the  sense  of  277 
Lyapunov  stable  275 
Lyapunov  stability  of  equilibrium  state 
274 

Lyapunov  unstable  277 
Machine  (s) 

continuous-mode  620 
discrete-mode  620 
parallel-mode  625 
Machine  languages  634 
Magnetic  field  and  electric  current  433 
Manifold, k-dimensional 151
Mapping,  linear  330 
Mass(es) 
distributed  207 
localized  207 

Mathematical  programming,  theory  of  503 
Mathematical  treatment  of  experimental 
results  39ff 

Maxwell’s  equation  438,  441 
Measure  of  a region  143 
Memory  unit  621 
Method  (s) 

of  Adams  292 
Aitken’s 31

direct  (calculus  of  variations)  503 
Euler’s  289 

graphical  (of  curve  fitting)  55 

iteration  30,  278 

Lagrange  478 

of  least  squares  51,  560 

of  Milne  292 

Monte  Carlo  640 

Newton’s  29 

numerical  13 

perturbation  33 

of  power  series  281 

recalculation  290 

Ritz  504 

of  Ritz-Galerkin  504 
of  Runge-Kutta  292 
of  successive  approximation  30 
of  tangents  29 

of  undetermined  coefficients  35 
Migdal,  A.B.  82,  645 
Milne,  method  of  292 
Minimax  134 
Mixed derivatives 113
Model-building  309 
Modulation 

amplitude  613 
frequency 614
Modulus 

of  a complex  number  159 
of elongation 301
of  spectral  density  612 
of  a vector  312,  316 
Modulus-argument  form  of  a complex 
number  159 
de  Moivre  formula  167 
Moment  (s) 

of  external  forces  396 
of  inertia  301,  406 
total  396 

Momentum,  angular  396 
Monte  Carlo  method  640 
Motion  (s) 

Brownian  538 



in  a central-force  field  396 
finite  398 
infinite  398 
Keplerian  402 
of  a material  point  324 
precessional  408 
Multidimensional  space  150 
Multidimensional  vector  space  333 
Multiple  integrals  139,  143 
Multiple roots 171
Multiplication  of  probabilities  517 
Multiplier,  Lagrange  478 
Multiply  connected  region  442 
Myškis, A.D. 9


Natural numbers 171
Natural  oscillations  180 
Neumann,  J.  von  619 
Newton,  I.  397,  456 
Newton’s  formula  for  quadratic  interpo- 
lation 42 

Newton’s  law  400 
Newton’s  method  29 
Newton's  second  law  325,  395 
Newton's  third  law  395 
Nodal  point  265 
Nonasymptotic  stability  257 
Non-Euclidicity  335 
Nonhomogeneous linear equations 231
Nonlinear  functional  459 
Nonsimply  connected  region  444 
Nonstationary  field  340 
Nonsteady-state  340 
Norm  607 

Normal,  principal  326 
Normal  law  555 
Normalization  350 
Null vector 313
Number(s) 171
Avogadro’s  537 

complex (see complex number) 158, 172
irrational 171
natural  171 

of  outward  vector  lines  354,  369 
pure  imaginary  161,  163 
rational 171
of  vector  lines  358 
Number  system 
base-10 623
base-2  623 
binary  623 
decimal  623 
denary  623 
Number  theory  561 
Numerical  integration  14 
Numerical methods 13
Numerical  series  88 

Open  system  269 
Operand  620 
Operation  (s) 
anticommutative  390 
commutative  390 
linear  334 
Operator(s) 211, 459
differentiation  211 
Fourier  589 
Hamiltonian  423 
Laplacian  377,  426 
linear  211 

Ordinary  differential  equations  225 
Ordinary  point  138 
Oriented  line  346 
Oriented  surface  351 
Original  589 

Orthogonal  system  of  functions  607 
complete  608 
incomplete  608 
Orthogonality  607 
Oscillation  (s) 
free  180 
harmonic  173 
natural  180 
Osculating  plane  327 
Ostrogradsky  formula  371 
Overdetermined  269 
Overdetermined  system  53 
Overflow  628 

Parabolic  points  138 
Paradoksov,  P.  288,  645 
Parallel-mode  machine  625 
Parameter(s) 
input  620 

integrals  dependent  on  99 
variation  of  233 

Parametric  representation  of  a function 
122 

Parametric  resonance  288 
Parity 

law  of  conservation  of  415 
principle  of  combined  415,  416 
Partial  derivatives  108,  109,  113 
Partial  differential  equations  225 
Pascal  law  379 
Permutation 
circular  420 
cyclic  420 

Perturbation  method  33 
Phase 

of  a complex  number  159 
of  spectral  density  612 


Phase  diagram  175 
Phase  velocity  of  light  497 
Phenomenon,  Gibbs  604 
Piecewise analytic functions 611
Planck,  Max  498,  645 
Planck’s  constant  288,  537,  599 
Plane,  osculating  327 
Plane  field  341 
Plane-parallel field 341
Point  (s) 
branch  189 
elliptic  137 
extremum  133 
focal  265 
hyperbolic  137 
ideal  153 
at  infinity  153 
isolated  138 
nodal  265 
ordinary  138 
parabolic  138 
saddle  265 

of  self-intersection  138 
singular  138,  263 
vortex  265 
Point  function  340 
Poisson  distribution  540 
alternative  derivation  of  542 
Poisson  equation  376,  377 
Poisson’s  formula  539 
Polar  vectors  415 
Pole  of  third  order  193 
Positive  integers  171 
Potential  426 
effective  400 

in  a multiply  connected  region  442 
Potential  energy  345,  347,  454 
Potential  fields  426 
Power  325,  347 
imaginary  164 

Power  series,  method  of  281 
Práger, M. 645
Precessional  motion  408 
Preimage  589 

Primes,  distribution  of  561 
Principal  axes  of  a tensor  411 
Principal  normal  326 
Principal  value,  Cauchy’s  588 
Principle  (s) 

of  causality  586 

of combined parity 415, 416

of  entropy  increase  488 

Fermat's  (in  optics)  491,  492 

of  least  action  497,  498,  499,  500 

of  superposition  211 

uncertainty  597,  599 
variational 491

Probabilities (see probability) 514
compound  (law  of)  517 
law  of  compound  517 
multiplication  of  517 
Probability (see probabilities)

theory  of  514ff 
Probability  curve  526 
Probability distribution function 515, 516
Probability  integral  529 
Problem  (s) 

boundary-value 297, 298
of  calculus  of  variations  457 
conditional-extremum  479 
Dido's  480,  482,  484 
in  economics  428,  450 
extrapolation  39,  58 
extremum 131

extremum  (with  restrictions)  488 
initial-value  297,  299 
interpolation  16,  39,  58 
isoperimetric  480 
spectrum  of  300 
Processes,  autonomous  275 
Product 
cross  389 
dot  319 
inner  319 
scalar 319
scalar triple 391

vector  (of  two  vectors)  388,  389 
vector  triple  391 
Program  splitting  632 
Programming  628 
Projection  of  a vector  316 
Projective  geometry  153 
Proper  integrals  65 
Pseudo-Euclidean  spaces  335 
Pseudoscalar  416 
Pseudovectors  415 
Punched  cards  624 
Punched  tape  624 
Pure  imaginary  numbers  161 
Pythagorean  theorem  for  space  320 

Quadratic  functional  459 
Quadratic  interpolation  42 
Newton’s  formula  for  42 
Quasi-Euclidean  spaces  335 

Radioactive  decay  57,  514,  539 
law  of  57 

Random  variable  548 
Rate 

convective 112
local  112 


Ratio test, d’Alembert's 89
Rational numbers 171
Rays,  beta  441 
Real  linear  space  335 
Recalculation  method  290 
Region 

multiply  connected  442 
nonsimply  connected  444 
simply  connected  444 
Regular  convergence  100 
Regularly  convergent  integral  100 
Relations,  dispersion  585,  589 
Relationship,  functional  556 
Representation 
floating-point  624 
of a function
implicit  123 
parametric  122 
Residues  190ff,  194 
Resistance,  anode  126 
Resonance  255 
parametric  288 
Riccati  equation  293 
Riemann (see Cauchy-Riemann
conditions)

Riemann  integral  222 
Rigid  body,  rotation  of  406 
Ritz,  F.  504 
Ritz  method  504 
Ritz-Galerkin,  method  of  504 
Roots 

of  a complex  number  169 
multiple 171

Rotation  (see  curl)  388,  421,  432 

Rotational  field  441 

Rule 

Simpson's  17 
trapezoidal  15 

Runge-Kutta,  method  of  292 
Ryzhik,  I.  M.  645 

Saddle  (see  minimax)  134 
Saddle  point  265 
Scalar  312 

Scalar  fields  340,  341 
Scalar  product  319 
Scalar  triple  product  391 
Second  divided  difference  454 
Sedov,  L.I.  645 
Self-intersection, point of 138
Semendyayev, K.A. 20, 645
Semenov,  N.N.  260 
Series 

absolutely  convergent  91 
conditionally convergent 91
convergent  88,  91 
divergent  88 

Fourier  602 
harmonic  90 
Laurent  194 
numerical  88 

Taylor's  (and  extremum  problems)  131 
Sieve  of  Eratosthenes  562 
Signum  function  218 
Similarity  of  phenomena  305 
Similarity  criteria  307 
Simply  connected  region  444 
Simpson’s  formula  17 
Simpson’s  rule  17 
Simulation  309 
Singular  points  138,  263 
Singularity  65 
Sink  369 

Sliding  vector  393 
Sobolev,  S.L.  610 
Solenoidal  fields  436 
Solution  (s) 

adiabatic  variation  of  285 

asymptotically  stable  256 

asymptotically  stable  in  the  sense 

of  Lyapunov  256 

stable  256 

unperturbed  34 

unstable  256 

Source  of  vector  lines  369 
Space 

complex  Hilbert  609 
complex linear 335
k-dimensional  151 
dimensionality  of  a 334 
Euclidean  334 
of events 151
function 606
Hilbert 606ff, 607, 608
infinite-dimensional  334 
linear  334 

multidimensional  150 
multidimensional  vector  333 
pseudo-Euclidean  335 
quasi-Euclidean 335
real  linear  335 
vector  334 

Special-purpose  computers  622 
Spectral  density  574,  583 
modulus of 612
phase of 612

Spectrum  (of  a function)  574 
continuous  574 
discrete  574 

Spectrum of a problem 300
Splitting  program  632 
Stability  (of  computers)  625 
asymptotic  276 



Lyapunov  (of  equilibrium  state)  274 
nonasymptotic  257 
Stable 

(Lyapunov  stable)  275 
in  the  sense  of  Lyapunov  275 
Stable  solutions  256 
Stationary  field  112,  340 
Statistical  array  534 
Steady-state  field  340 
Step  (in  calculations)  291 
Stieltjes  integral  221,  222 
Stirling's  formula  82,  83 
Stokes'  theorem  422 
Stop  (computers)  628 
Storage  element  621,  623 
Stream  tube  383 
Strength  of  a source  369 
Stress  tensor  412,  413 
Subprogram  634 
Substitution,  address  632 
Successive  approximation,  method  of  30 
Sums 

alternating  24 

computing  (by  means  of  integrals)  20 
Superposition,  principle  of  211 
Surface(s) 

convex  in  the  large  137 
isobaric  343 
isothermic  343 
level  343 
oriented  351 
Symbol,  variation  497 
Symmetric  tensors  408,  411 
System(s) 
autonomous  275 
closed  269 

of  differential  equations  265 
number  (see  number  systems) 
open  269 

overdetermined  53,  269 
underdetermined  269 

Tables  and  differences  39 
Tabulation  interval  39 
Tape,  punched  624 

Taylor’s  series  and  extremum  problems  131 

Tensor(s) 

antisymmetric  408,  413 
basic  facts  about  328ff 
elastic  stress  412,  413 
inertia  409 
principal  axes  of  411 
of  rank  m 331 
stress  412,  413 


symmetric  408,  411 
trace  of  414 
unit  409 

Tensor  analysis  332 
Test 

d’Alembert's  ratio  89 
Cauchy’s  integral  90 
Leibniz  91 
Theorem 

Cauchy’s  integral  186 
divergence  371 
Gauss’  359 

Pythagorean  (for  space)  320 
Stokes’  422 
Theory 
field  340ff 
of  mathematical  programming  503 
number  561 
of  probability  514ff 
Thermodynamics,  second  law  of  538 
Three-address  instructions  625 
Top  407 

Torricelli's  law  71 
Total  differential  109 
Trace  of  a tensor  414 
Transconductance  126 
Transfer  function  586 
Transform,  Fourier  589 
Transformation 
bell-shaped  597 

Fourier  (see  Fourier  transformation)  573ff,  589 

Transformation  group  590 
Transient  process  254 
Translator  (computer)  634 
Trapezoidal  rule  15 
Trials 

many  (analyzing  results  of)  522 
very  large  number  of  549 
Trigonometric  form  of  a complex  number  159 

Triple  integrals  143 
True  vectors  415 

Uncertainty  principle  597,  599 
Unconditional  transfer  of  control  instruction  630 

Undetermined  coefficients,  method  of  35 
Undetermined  system  269 
Uniform  convergence  100 
Unit 

arithmetic  622 
control  622 
external  memory  625 
internal  memory  625 
memory  621 


Unit-step  function  214 

Unit  tensor  409 

Unperturbed  solution  34 

Unstable  Lyapunov  277 

Unstable  in  the  sense  of  Lyapunov  277 

Unstable  solutions  256 

Vaganova,  A.Ya.  407 
Value,  Cauchy's  principal  588 
Variable,  random  548 
Variance  549 
Variation 

adiabatic  (of  a solution)  285 
of  constants  233 
of  a functional  459 
of  parameters  233 
Variation  principles  491 
Variation  symbol  497 
Vector(s)  312ff 

absolute  value  of  312,  316 
antiparallel  314 
area  379 
axial  415 

component  of  315 
derivative  of  321 
generalized  334 
length  of  159 
linear  combination  of  314 
linear  operations  on  313 
linearly  dependent  314 
linearly  independent  314 
modulus  of  312,  316 
null  313 
polar  415 
projection  of  316 
scalar  product  of  319 
sliding  393 
true  415 
zero  313 

Vector  fields  340 
Vector  lines  369 

Vector  product  of  two  vectors  388,  389 

Vector  space  334 

Vector  triple  product  391 

Velocity 

angular  394 
group  (of  light)  497 
phase  (of  light)  497 
Velocity  field  351 
Vitásek,  E.  645 
Volume  integral  140,  141 
Vortex  line  432 
Vortex  point  265 

Wave  packet  600 
Weight  52 

Zero  vector  313 